Abstract

We consider various stochastic models that incorporate the notion of risk-averseness into the standard 
2-stage recourse model, and develop novel techniques for solving the algorithmic problems arising in 
these models. A key notable feature of our work that distinguishes it from work in some other related 
models, such as the (standard) budget model and the (demand-) robust model, is that we obtain results 
in the black-box setting, that is, where one is given only sampling access to the underlying distribution. 
Our first model, which we call the risk-averse budget model, incorporates the notion of risk-averseness 
via a probabilistic constraint that restricts the probability (according to the underlying distribution) with 
which the second-stage cost may exceed a given budget B to at most a given input threshold p. We also 
consider a closely-related model that we call the risk-averse robust model, where we seek to minimize
the sum of the first-stage cost and the (1 - p)-quantile (according to the distribution) of the second-stage cost.

We obtain approximation algorithms for a variety of combinatorial optimization problems including 
the set cover, vertex cover, multicut on trees, min cut, and facility location problems, in the risk-averse 
budget and robust models with black-box distributions. We first devise a fully polynomial approximation 
scheme for solving the LP-relaxations of a wide variety of risk-averse budgeted problems. Complementing
this, we give a rounding procedure that lets us use existing LP-based approximation algorithms for
the 2-stage stochastic and/or deterministic counterpart of the problem to round the fractional solution. 
Thus, we obtain near-optimal solutions to risk-averse problems that preserve the budget approximately 
and incur a small blow-up of the probability threshold (both of which are unavoidable). To the best of 
our knowledge, these are the first approximation results for problems involving probabilistic constraints 
and black-box distributions. Our results extend to the setting with non-uniform scenario-budgets, and to 
a generalization of the risk-averse robust model, where the goal is to minimize the sum of the first-stage 
cost and a weighted combination of the expectation and the (1 - p)-quantile of the second-stage cost.

1 Introduction 

Stochastic optimization models provide a means to model uncertainty in the input data where the uncertainty 
is modeled by a probability distribution over the possible realizations of the actual data, called scenarios. 
Starting with the work of Dantzig [10] and Beale [2] in the 1950s, these models have found increasing 
application in a wide variety of areas; see, e.g., [4, 35] and the references therein. An important and widely- 
used model in stochastic programming is the 2-stage recourse model: first, given the underlying distribution 
over scenarios, one may take some first-stage actions to construct an anticipatory part of the solution, x, 
incurring an associated cost c(x). Then, a scenario A is realized according to the distribution, and one 
may take additional second-stage recourse actions $y_A$ incurring a certain cost $f_A(x, y_A)$. The goal in the
standard 2-stage model is to minimize the total expected cost, $c(x) + \mathrm{E}_A[f_A(x, y_A)]$. Many applications
come under this setting. An oft-cited motivating example is the 2-stage stochastic facility location problem. 
A company has to decide where to set up its facilities to serve client demands. The demand-pattern is not 
known precisely at the outset, but one does have some statistical information about the demands. The first-
stage decisions consist of deciding which facilities to open initially, given the distributional information 
about the demands; once the client demands are realized according to this distribution, we can extend the 
solution by opening more facilities, incurring their recourse costs. The recourse costs are usually higher 
than the original ones (e.g., because opening a facility later involves deploying resources with a small lead 
time), could be different for the different facilities, and could even depend on the realized scenario. 

A common criticism of the standard 2-stage model is that the expectation measure fails to adequately 
measure the "risk" associated with the first-stage decisions: two solutions with the same expected cost are 
valued equally. But in realistic settings, one also considers the risk involved in the decision. For example, in 
the stochastic facility location problem, given two solutions with the same expected cost, one which incurs a 
moderate second-stage cost in all scenarios, and one where there is a non-negligible probability that a "dis- 
aster scenario" with a huge associated cost occurs, a company would naturally prefer the former solution. 

Our models and results. We consider various stochastic models that incorporate the notion of risk- 
averseness into the standard 2-stage model and develop novel techniques for solving the algorithmic prob- 
lems arising in these models. A key notable feature of our work that distinguishes it from work in some other 
related models [19, 11], is that we obtain results in the black-box setting, that is, where one is given only 
sampling access to the underlying distribution. To better motivate our models, we first give an overview of 
some related models considered in the approximation-algorithms literature that also embody the idea of risk- 
protection, and point out why these models are ill-suited to the design of algorithms in the black-box setting. 

One simple and natural way of providing some assurance against the risk due to scenario-uncertainty is 
to provide bounds on the second-stage cost incurred in each scenario. Two closely related models in this 
vein are the budget model, considered by Gupta, Ravi and Sinha [19], and the (demand-) robust model, 
considered by Dhamdhere, Goyal, Ravi and Singh [11]. In the budget model, one seeks to minimize the 
expected total cost subject to the constraint that the second-stage cost $f_A(x, y_A)$ incurred in every scenario
$A$ be at most some input budget $B$. (In general, one could have a different budget $B_A$ for each scenario $A$,
but for simplicity we focus on the uniform-budget model.) Gupta et al. considered the budget model in the 
polynomial scenario setting, where one is given explicitly a list of all scenarios (with non-zero probability) 
and their probabilities, thereby restricting their attention to distributions with a polynomial-size support. In 
the robust model considered by Dhamdhere et al. [11], which is more in the spirit of robust optimization, 
the goal is to minimize $c(x) + \max_A f_A(x, y_A)$. It is easy to see how the two models are related: if one
"guesses" the maximum second-stage cost B incurred by the optimum, then the robust problem essentially 
reduces to the budget problem with budget B, except that the second-stage cost term in the objective func- 
tion is replaced by B (which is a constant). Notice that it is not clear how to even specify problems with 
exponentially many scenarios in the robust model. Feige et al. [14] expanded the model of [11] by consid- 
ering exponentially many scenarios, where the scenarios are implicitly specified by a cardinality constraint. 
However, considering scenario-collections that are determined only by a cardinality constraint seems rather 
specialized and somewhat artificial, especially in the context of stochastic optimization; e.g., in facility loca- 
tion, it is rather stylized (and overly conservative) to assume that every set of k clients (for some k) may show 
up in the second-stage. We will consider a more general way of specifying (exponentially many) scenarios 
in robust problems, where the input specifies a black-box distribution and the collection of scenarios is then 
given by the support of this distribution. We shall call this model the distribution-based robust-model. 

Both the budget model and the (distribution-based) robust model suffer from certain common drawbacks. 
A serious algorithmic limitation of both these models (see Section 5) is that for almost any (non-trivial) 
stochastic problem (such as fractional stochastic set cover with at most 3 elements, 3 sets, and 3 scenarios), one
cannot obtain any approximation guarantees in the black-box setting using any bounded number of samples 
(even allowing for a bounded violation of the budget in the budget model). Intuitively, the reason for 
this is that there could be scenarios that occur with vanishingly small probability that one will almost never 
encounter in our samples, but which essentially force one to take certain first-stage actions in order to satisfy
the budget constraints in the budget model, or to obtain a low-cost solution in the robust model. Notice also 
that both the budget and robust models adopt the conservative view that one needs to bound the second- 
stage cost in every scenario, regardless of how likely it is for the scenario to occur. (By the same token, 
they also provide the greatest amount of risk-aversion.) In contrast, many of the risk-models considered 
in the finance and stochastic-optimization literature, such as the mean-risk model [27], value-at-risk (VaR) 
constraints [30, 23, 32], conditional VaR [34], do factor in the probabilities of different scenarios. 

Our models for risk-averse stochastic optimization address the above issues, and significantly refine 
and extend the budget and robust models. Our goal is to come up with a model that is sufficiently rich in 
modeling power to allow for black-box distributions, and in which one can obtain strong algorithmic results. 
Our models are motivated by the observation (see Appendix A) that it is possible to obtain approximation 
guarantees in the budget model with black-box distributions, if one allows the second-stage cost to exceed 
the budget with some "small" probability p (according to the underlying distribution). We can turn this 
solution concept around and incorporate it into the model to arrive at the following. We are now given a 
probability threshold $p \in [0, 1]$. In our new budget model, which we call the risk-averse budget model,
given a budget $B$, we seek $(x, \{y_A\})$ so as to minimize $c(x) + \mathrm{E}_A[f_A(x, y_A)]$ subject to the probabilistic
constraint $\Pr_A[f_A(x, y_A) > B] \le p$. The corresponding risk-averse (distribution-based) robust model
seeks to minimize $c(x) + Q_p[f_A(x, y_A)]$, where $Q_p[f_A(x, y_A)]$ is the $(1 - p)$-quantile of $\{f_A(x, y_A)\}_{A \in \mathcal{A}}$,
that is, the smallest number $B$ such that $\Pr_A[f_A(x, y_A) > B] \le p$. Notice that the parameter $p$ allows us
to control the risk-aversion level and trade off risk-averseness against conservatism (in the spirit of [3, 41]).
Taking $p = 1$ in the risk-averse budget model gives the standard 2-stage recourse model, whereas taking $p = 0$
in the risk-averse budget- or robust-models recovers the standard budget- and robust models respectively.
In the sequel, we treat p as a constant that is not part of the input. 
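
As a small illustration of the two risk measures just defined, the following sketch evaluates the probabilistic constraint and the $(1 - p)$-quantile on a finite list of second-stage costs. The cost values and the function names are hypothetical, and the scenarios are assumed to be equally likely samples; this is only meant to make the definitions concrete, not to reflect how the algorithms in this paper operate.

```python
def exceed_probability(costs, budget):
    """Empirical Pr[cost > budget] over equally weighted sampled scenarios."""
    return sum(1 for c in costs if c > budget) / len(costs)


def quantile_q(costs, p):
    """Smallest observed B with empirical Pr[cost > B] <= p (assumes 0 <= p < 1)."""
    for b in sorted(costs):  # the exceedance probability is non-increasing in b
        if exceed_probability(costs, b) <= p:
            return b
    raise ValueError("p must be nonnegative")


# Hypothetical second-stage costs in 10 sampled scenarios,
# including one "disaster scenario" with a huge cost.
costs = [1, 2, 2, 3, 3, 3, 4, 5, 8, 40]

budget_ok = exceed_probability(costs, 8) <= 0.1   # risk-averse budget constraint, B = 8
robust_value = quantile_q(costs, 0)               # p = 0: the standard robust (max) value
risk_averse_value = quantile_q(costs, 0.1)        # p = 0.1: ignores the rarest 10% of mass
```

With $p = 0$ the quantile coincides with the maximum cost, matching the remark that the standard robust model is recovered; a positive $p$ lets the single disaster scenario be discounted.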

We obtain approximation algorithms for a variety of combinatorial optimization problems (Section 4) 
including the set cover, vertex cover, multicut on trees, min cut, and facility location problems, in the 
risk-averse budget and robust models with black-box distributions. We obtain near-optimal solutions that 
preserve the budget approximately and incur a small blow-up of the probability threshold. (One should 
expect to violate the budget here; otherwise, by setting very high first-stage costs, one would be able to solve 
the decision version of an NP-hard problem!) To the best of our knowledge, these are the first approximation
results for problems with probabilistic constraints and black-box distributions. Our results extend to the 
setting with non-uniform scenario-budgets, and to a generalization of the risk-averse robust model, where 
the goal is to minimize $c(x)$ plus a weighted combination of $\mathrm{E}_A[f_A(x, y_A)]$ and $Q_p[f_A(x, y_A)]$. In the
sequel, we focus primarily on the risk-averse budget model since results obtained in this model essentially
translate to the risk-averse robust model (the budget-violation can be absorbed into the approximation ratio). 

Our results are built on two components. First, and this is the technically more difficult component and 
our main contribution, we devise a fully polynomial approximation scheme for solving the LP-relaxations
of a wide variety of risk-averse problems (Theorem 3.3). We show that in the black-box setting, for a wide
variety of 2-stage problems, for any $\epsilon, \kappa > 0$, in time polynomial in the input size, $\lambda$, $\frac{1}{\epsilon}$, and $\frac{1}{\kappa}$, one can compute (with high probability) a
solution to the LP-relaxation of the risk-averse budgeted problem, of cost at most $(1 + \epsilon)$ times the optimum
where the probability that the second-stage cost exceeds the budget $B$ is at most $p(1 + \kappa)$. Here $\lambda$ is the maximum
ratio between the costs of the same action in stage II and stage I (e.g., opening a facility or choosing a
set). We show in Section 5 that the dependence on $\frac{1}{\kappa}$, and hence the violation of the probability-threshold,
is unavoidable in the black-box setting. We believe that this is a general tool of independent interest that 
will find application in the design of approximation algorithms for other discrete risk-averse stochastic op- 
timization problems, and that our techniques will find use in solving other probabilistic programs. 

The second component is a simple rounding procedure (Theorem 3.2) that complements (and motivates) 
the above approximation scheme. As we mention below, our LP-relaxation is a relaxation of even the 
fractional risk-averse problem (i.e., where one is allowed to take fractional decisions). We give a general 
rounding procedure to convert a solution to our LP-relaxation to a solution to the fractional risk-averse
problem losing a certain factor in the solution cost, budget, and the probability of budget-violation. This 
allows us to then use an LP-based "local" approximation algorithm for the corresponding 2-stage problem to 
obtain an integer solution, where a local algorithm is one that approximately preserves the LP-cost of each 
scenario. In particular, for various covering problems, one can use the local 2c-approximation algorithm 
in [38], which is obtained using an LP-based c-approximation algorithm for the deterministic problem. 

We need to overcome various obstacles to devise our approximation scheme. The first difficulty faced 
in solving a probabilistic program such as ours, is that the feasible region of even the fractional problem 
is a non-convex set. Thus, even in the polynomial-scenario setting, it is not clear how to solve (even) the 
fractional risk-averse problem. (In contrast, in the standard 2-stage recourse model, the fractional problem 
can be easily formulated and solved as a linear program (LP) in the polynomial-scenario setting.) We 
formulate an LP-relaxation (which is also a relaxation of the fractional problem), where we introduce a 
variable $r_A$ for every scenario $A$ that is supposed to indicate whether the budget is exceeded in scenario
A. Correspondingly, we have two sets of decision variables to denote the decisions taken in scenario A 
in the two cases respectively where the budget is exceeded and where it is not exceeded. The constraints 
that enforce this semantics will of course be problem-specific, but a common constraint that figures in all 
these formulations is $\sum_A p_A r_A \le p$, which captures our probabilistic constraint. This constraint, which
couples the different scenarios, creates significant challenges in solving the LP-relaxation. (Again, notice 
the contrast with the standard 2-stage recourse model.) We get around the difficulty posed by this coupling 
constraint by taking the Lagrangian dual with respect to this constraint, introducing a dual variable $\Delta \ge 0$.
The resulting maximization problem (over $\Delta$) has a 2-stage minimization LP embedded inside it. Although
this 2-stage LP does not belong to the class of problems defined in [38, 45, 7], we prove that for any fixed $\Delta$,
this 2-stage LP can be solved to "near-optimality" using the sample average approximation (SAA) method.
The crucial insight here is to realize that for the purpose of obtaining a near-optimal solution to the risk-averse
LP, it suffices to obtain a rather weak guarantee for the 2-stage LP, where we allow for an additive
error proportional to $\Delta$. This guarantee is specifically tailored so that it is weak enough that one can prove
it by showing "closeness-in-subgradients" and using the analysis in [45], and yet can be leveraged
to obtain a near-optimal solution to (the relaxation of) our risk-averse problem. Given this guarantee, we
show that one can efficiently find a suitable value for $\Delta$ such that the solution obtained for this $\Delta$ (via the
SAA method) satisfies the desired guarantees.

Related work. Stochastic optimization is a field with a vast amount of literature; we direct the reader to [4, 
30, 35] for more information on the subject. We survey the work that is most relevant to our work. Stochastic 
optimization problems have only recently been studied from an approximation-algorithms perspective. A 
variety of approximation results have been obtained in the 2-stage recourse model, but more general models, 
such as risk-optimization or probabilistic-programming models have received little or no attention.

The (standard) budget model was first considered by Gupta et al. [19], who designed approximation 
algorithms for stochastic network design problems in this model. Dhamdhere et al. [11] introduced the 
demand-robust model (which we call the robust model), and obtained algorithms for the robust versions of 
various combinatorial optimization problems; some of their guarantees were later improved by Golovin et 
al. [16]. All these works focus on the polynomial-scenario setting. Feige, Jain, Mahdian, and Mirrokni [14] 
considered the robust model with exponentially many scenarios that are specified implicitly via a cardinality 
constraint, and derived approximation algorithms for various covering problems in this more general model. 

There is a large body of work in the finance and stochastic-optimization literature, dating back to 
Markowitz [27], that deals with risk-modeling and optimization; see e.g., [34, 1, 36] and the references 
therein. Our risk-averse models are related to some models in finance. In fact, the probabilistic constraint 
that we use is called a value-at-risk (VaR) constraint in the finance literature, and its use in risk-optimization 
is quite popular in finance models; it has even been written into some industry regulations [23, 32]. 
Problems involving probabilistic constraints are called probabilistic or chance-constrained programs [8,
29] in the stochastic-optimization literature, and have been extensively studied (see, e.g., Prekopa [31]). Re- 
cent work in this area [6, 28, 13] has focused on replacing the original probabilistic constraint by more 
tractable constraints so that any solution satisfying the new constraints also satisfies the original probabilis- 
tic constraint with high probability. Notice that this type of "relaxation" is opposite to what one aims for 
in the design of approximation algorithms, where we want that every solution to the original problem re- 
mains a solution to the relaxation (but most likely, not vice versa). Although some approximation results 
in the opposite direction are obtained in [6, 28, 13], they are obtained for very structured constraints of 
the type $\Pr_\xi[G(x, \xi) \notin C] \le p$, where $C$ is a convex set, $\xi$ is a continuous random variable whose distribution
satisfies a certain concentration-of-measure property, and $G(\cdot)$ is a bi-affine or convex mapping;
also the bounds obtained involve a relatively large violation of the probability threshold (compared to our
$(1 + \kappa)$-factor). To the best of our knowledge, there is no prior work in the stochastic-optimization or
finance literature on the design of efficient algorithms with provable worst-case guarantees for discrete risk-optimization
or probabilistic-programming problems. In the Computer Science literature, [24] and [15]
consider the stochastic bin packing and knapsack problems with probabilistic constraints that limit the over- 
flow probability of a bin or the knapsack, and obtained novel approximation algorithms for these problems. 
Their results are however obtained for specialized distributions where the item sizes are independent random 
variables following Bernoulli, exponential, or Poisson distributions specified in the input. In the context of 
stochastic optimization, this constitutes a rather stylized setting that is far from the black-box setting. 

The work closest in spirit to ours is that of So, Zhang, and Ye [41]. They consider the problem of 
minimizing the first-stage cost plus a risk-measure called the conditional VaR (CVaR) [34]. Their model 
interpolates between the 2-stage recourse model and the (standard) robust model (as opposed to the budget 
model in our case). They give an approximation scheme for solving the LP-relaxations of a broad class of 
problems in the black-box setting, using which they obtain approximation algorithms for certain discrete 
optimization problems. Our methods are however quite different from theirs. In their model, the fractional 
problem yields a convex program and moreover, they are able to use a nice representation theorem in [34] 
for the CVaR measure to convert their problem into a 2-stage problem and then adapt the methods in [7]. 
In our case, the non-convexity inherent in the probabilistic constraint creates various difficulties (first the 
non-convexity, then the coupling constraint) and we consequently need to work harder to obtain our result. 
We remark that our techniques can be used to solve a generalization of their model, where we have the same 
objective function but also include a probabilistic budget constraint as in our risk-averse budget model. 

We now briefly survey the approximation results in recourse models. The first such approximation re- 
sult appears to be due to Dye, Stougie, and Tomasgard [12]. The recent interest and flurry of algorithmic 
activity in this area can be traced to the work of Ravi and Sinha [33] and Immorlica, Karger, Minkoff and 
Mirrokni [22], which gave approximation algorithms for the 2-stage variants of various discrete optimiza- 
tion problems in the polynomial scenario [33, 22] and independent-activation [22] settings. Approximation 
algorithms for 2-stage problems with black-box distributions were first obtained by Gupta, Pal, Ravi and 
Sinha [17], and subsequently by Shmoys and Swamy [38] (see also preliminary version [39]). Various other 
approximation results for 2-stage problems have since been obtained; see, e.g., the survey [44]. Multi- 
stage recourse problems in the black-box model were considered by [18, 45]; both obtain approximation 
algorithms with guarantees that deteriorate with the number of stages, either exponentially [18] (except for 
multistage Steiner tree which was also considered in [20]), or linearly [45]; improved guarantees for set 
cover and vertex cover have been subsequently obtained [42]. 

Our approximation scheme makes use of the SAA method, which is a simple and appealing method for 
solving stochastic problems that is quite often used in practice. In the SAA method one samples a certain 
number of scenarios to estimate the scenario probabilities by their frequency of occurrence, and then solves 
the 2-stage problem determined by this approximate distribution. The effectiveness of this method depends 
on the sample size (ideally, polynomial) required to guarantee that an optimal solution to the SAA-problem 
is a provably near-optimal solution to the original problem. Kleywegt et al. [25] (see also [37]) prove a bound
that depends on the variance of a certain quantity that need not be polynomially bounded. Subsequently, 
Swamy and Shmoys [45], and Charikar et al. [7] obtained improved (polynomial) sample-bounds for a large 
class of structured 2-stage problems. The proof in [45], which also applies to multistage programs, is based 
on leveraging approximate subgradients, and our proof makes use of portions of their analysis. The proof of 
Charikar et al. [7] is quite different; it applies to 2-stage programs but proves the stronger theorem that even 
approximate solutions to the SAA problem translate to approximate solutions to the original problem. 
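
The SAA method described above can be sketched in a few lines. The code below is a minimal illustration under stated assumptions: `toy_black_box` is a hypothetical sampling oracle standing in for the black-box distribution, scenarios are represented as frozensets of elements, and `saa_objective` simply evaluates a first-stage cost plus the expected second-stage cost under the estimated distribution. It says nothing about the sample size needed for the guarantees discussed in the text.

```python
import random
from collections import Counter


def saa_distribution(sample_scenario, num_samples, rng):
    """Estimate scenario probabilities by their frequency of occurrence."""
    counts = Counter(sample_scenario(rng) for _ in range(num_samples))
    return {scenario: k / num_samples for scenario, k in counts.items()}


def saa_objective(first_stage_cost, second_stage_cost, p_hat):
    """First-stage cost plus the expected second-stage cost under p_hat."""
    return first_stage_cost + sum(q * second_stage_cost(a) for a, q in p_hat.items())


# Hypothetical black box: each scenario is the set of elements that must be covered.
SUPPORT = [frozenset({1}), frozenset({1, 2}), frozenset({2, 3})]


def toy_black_box(rng):
    return rng.choices(SUPPORT, weights=[0.6, 0.3, 0.1])[0]


rng = random.Random(0)
p_hat = saa_distribution(toy_black_box, 1000, rng)
# Toy second-stage cost: one unit per element that shows up in the scenario.
estimate = saa_objective(2.0, len, p_hat)
```

The 2-stage problem determined by `p_hat` has polynomially many scenarios (at most the number of samples), which is what makes it amenable to the polynomial-scenario techniques mentioned earlier.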

2 Preliminaries 

Let $\mathbb{R}_+$ denote $\mathbb{R}_{\ge 0}$. Let $\|u\|$ denote the $\ell_2$ norm of $u$. The Lipschitz constant of a function $f : \mathbb{R}^m \mapsto \mathbb{R}$
is the smallest $K$ such that $|f(x) - f(y)| \le K\|x - y\|$ for all $x, y$. We consider convex minimization problems
$\min_{x \in \mathcal{P}} f(x)$, where $\mathcal{P} \subseteq \mathbb{R}^m_+$ with $\mathcal{P} \subseteq B(\mathbf{0}, R) = \{x : \|x\| \le R\}$ for a suitable $R$, and $f$ is convex.

Definition 2.1 Let $f : \mathbb{R}^m \mapsto \mathbb{R}$ be a function. We say that $d \in \mathbb{R}^m$ is a subgradient of $f$ at the point $u$ if
the inequality $f(v) - f(u) \ge d \cdot (v - u)$ holds for every $v \in \mathbb{R}^m$. We say that $d$ is an $(\omega, \xi)$-subgradient of
$f$ at the point $u \in \mathcal{P}$ if for every $v \in \mathcal{P}$, we have $f(v) - f(u) \ge d \cdot (v - u) - \omega f(v) - \omega f(u) - \xi$.

The above definition of an $(\omega, \xi)$-subgradient is slightly weaker than the notion of an $\omega$-subgradient as
defined in [38], where one requires that $f(v) - f(u) \ge d \cdot (v - u) - \omega f(u)$. But this difference is superficial;
one could also implement the algorithm in [38] using the weaker notion of an $(\omega, \xi)$-subgradient. It is well
known (see [5]) that a convex function has a subgradient at every point. One can infer from Definition 2.1
that, letting $d_x$ denote a subgradient of $f$ at $x$, the Lipschitz constant of $f$ is at most $\max_x \|d_x\|$.

Let $K$ be a positive number, and $\tau, \varrho$ be two parameters with $\varrho \le 1$. Let $N = \log\bigl(\frac{2KR}{\tau}\bigr)$. Let $G'_\tau \subseteq \mathcal{P}$
be a discrete set such that for any $x \in \mathcal{P}$, there exists $x' \in G'_\tau$ with $\|x - x'\| \le \frac{\tau}{KN}$. Define $G_\tau =
G'_\tau \cup \{x + t(y - x),\ y + t(x - y) : x, y \in G'_\tau,\ t = 2^{-i},\ i = 1, \ldots, N\}$. We call $G'_\tau$ and $G_\tau$ a $\frac{\tau}{KN}$-net
and an extended $\frac{\tau}{KN}$-net respectively of $\mathcal{P}$. As shown in [45], if $\mathcal{P}$ contains a ball of radius $V$ (where $V \le 1$
without loss of generality), then one can construct $G'_\tau$ so that $|G_\tau| = \mathrm{poly}\bigl(\log(\frac{KR}{V\tau})\bigr)$. As mentioned earlier,
our algorithms make use of the sample average approximation (SAA) method. The following result from
Swamy and Shmoys [45], which we have adapted to our setting, will be our main tool for analyzing the
SAA method.

Lemma 2.2 ([45]) Let $f$ and $\hat{f}$ be two nonnegative convex functions with Lipschitz constant at most $K$ such
that at every point $x \in G_\tau$, there exists a vector $\hat{d}_x \in \mathbb{R}^m$ that is a subgradient of $\hat{f}(\cdot)$ and an $(\frac{\varrho}{8N}, \xi)$-subgradient
of $f(\cdot)$ at $x$. Let $\hat{x} = \arg\min_{x \in \mathcal{P}} \hat{f}(x)$. Then, $f(\hat{x}) \le (1 + \varrho) \min_{x \in \mathcal{P}} f(x) + 6\tau + 2N\xi$.

Lemma 2.3 (Chernoff-Hoeffding bound [21]) Let $X_1, \ldots, X_N$ be iid random variables with each $X_i \in [0, 1]$
and $\mu = \mathrm{E}[X_i]$. Then, $\Pr\bigl[\,\bigl|\frac{1}{N}\sum_i X_i - \mu\bigr| > \epsilon\,\bigr] \le 2e^{-2\epsilon^2 N}$.
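
The Chernoff-Hoeffding bound is easy to sanity-check numerically. The sketch below compares the bound $2e^{-2\epsilon^2 N}$ with the empirical frequency of large deviations for Bernoulli variables; the mean 0.3, the sample size, and the number of trials are arbitrary illustrative choices and the function names are hypothetical.

```python
import math
import random


def hoeffding_bound(eps, n):
    """Right-hand side 2 * exp(-2 * eps^2 * n) of the Chernoff-Hoeffding bound."""
    return 2 * math.exp(-2 * eps ** 2 * n)


def empirical_deviation_rate(n, eps, trials, rng, mu=0.3):
    """Fraction of trials in which the mean of n Bernoulli(mu) draws is off by more than eps."""
    bad = 0
    for _ in range(trials):
        mean = sum(1 if rng.random() < mu else 0 for _ in range(n)) / n
        if abs(mean - mu) > eps:
            bad += 1
    return bad / trials


rng = random.Random(0)
bound = hoeffding_bound(0.1, 200)                     # 2 * exp(-4)
rate = empirical_deviation_rate(200, 0.1, 2000, rng)  # observed deviation frequency
```

The bound is distribution-free, so the observed rate is typically well below it; what matters for the sample-size arguments is that the bound decays exponentially in $N$.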

3 The risk-averse budgeted set cover problem: an illustrative example 

Our techniques can be used to efficiently solve the risk-averse versions of a variety of 2-stage stochastic 
optimization problems, both in the risk-averse budget and robust models. In this section, we illustrate the 
main underlying ideas by focusing on the risk-averse budgeted set cover problem. In the risk-averse budgeted
set cover problem (RASC), we are given a universe $U$ of $n$ elements and a collection $\mathcal{S}$ of $m$ subsets of $U$.
The set of elements to be covered is uncertain: we are given a probability distribution $\{p_A\}_{A \in \mathcal{A}}$ of scenarios,
where each scenario $A$ specifies a subset of $U$ to be covered. The cost of picking a set $S \in \mathcal{S}$ in the first stage
is $w^{\mathrm{I}}_S$, and is $w^A_S$ in scenario $A$. The goal is to determine which sets to pick in stage I and which ones to pick
in each scenario so as to minimize the expected cost of picking sets, subject to $\Pr_A[\text{cost of scenario } A > B] \le p$,
where $p$ is a constant that is not part of the input. Notice that the costs $w^A$ are only revealed when
we sample scenario $A$; thus, the "input size", denoted by $\mathcal{I}$, is $O(m + n + \sum_S \log w^{\mathrm{I}}_S + \log B)$.

For a given (fractional) point $x \in \mathbb{R}^m$ with $0 \le x_S \le 1$ for all $S$, define $f_A(x)$ to be the minimum value
of $w^A \cdot y_A$ subject to $\sum_{S : e \in S} y_{A,S} \ge 1 - \sum_{S : e \in S} x_S$ for all $e \in A$, and $y_{A,S} \ge 0$ for all $S$. Let $\mathcal{P} = [0, 1]^m$. As
mentioned in the Introduction, the set of feasible solutions to even the fractional risk-averse problem (where
one can buy sets fractionally) is not in general a convex set. We consider the following LP-relaxation of the
problem, which is a relaxation of even the fractional risk-averse problem (Claim 3.1). Throughout we use
$A$ to index the scenarios in $\mathcal{A}$, and $S$ to index the sets in $\mathcal{S}$.



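$$
\begin{aligned}
\min \quad & \sum_S w^{\mathrm{I}}_S x_S + \sum_{A,S} p_A \bigl(w^A_S y_{A,S} + w^A_S z_{A,S}\bigr) && && \text{(RASC-P)} \\
\text{s.t.} \quad & \sum_A p_A r_A \le p && && (1) \\
& \sum_{S : e \in S} (x_S + y_{A,S}) + r_A \ge 1 && \text{for all } A,\ e \in A, && (2) \\
& \sum_{S : e \in S} (x_S + y_{A,S} + z_{A,S}) \ge 1 && \text{for all } A,\ e \in A, && (3) \\
& \sum_S w^A_S y_{A,S} \le B && \text{for all } A, && (4) \\
& x_S,\ y_{A,S},\ z_{A,S},\ r_A \ge 0 && \text{for all } A, S. && (5)
\end{aligned}
$$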

Here $x$ denotes the first-stage decisions. The variable $r_A$ denotes whether one exceeds the budget of $B$
for scenario $A$, and the variables $y_{A,S}$ and $z_{A,S}$ denote respectively the sets picked in scenario $A$ in the
situations where one does not exceed the budget (so $r_A = 0$) and where one does exceed the budget (so
$r_A = 1$). Consequently, constraint (4) ensures that the cost of the $y_A$ decisions does not exceed the budget
$B$, and (1) ensures that the total probability mass of scenarios where one does exceed the budget is at most
$p$. Let OPT denote the optimum value of (RASC-P).
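
To make the role of each constraint concrete, the following sketch checks a candidate solution against constraints (1)-(5) of (RASC-P) and evaluates its objective. It assumes an explicit (polynomial-scenario) description of a tiny instance; the instance data, the dictionary layout, and the name `check_rasc_p` are hypothetical and chosen only for illustration.

```python
def check_rasc_p(sets, scenarios, w1, x, y, z, r, budget, p, tol=1e-9):
    """Return (feasible, objective) for a candidate solution to (RASC-P)."""
    feasible = sum(sc["prob"] * r[a] for a, sc in scenarios.items()) <= p + tol  # (1)
    feasible = feasible and all(x[s] >= -tol for s in sets)                       # (5)
    objective = sum(w1[s] * x[s] for s in sets)
    for a, sc in scenarios.items():
        for e in sc["elements"]:
            covering = [s for s in sets if e in sets[s]]
            cover_xy = sum(x[s] + y[a][s] for s in covering)
            cover_xyz = cover_xy + sum(z[a][s] for s in covering)
            feasible = feasible and cover_xy + r[a] >= 1 - tol                    # (2)
            feasible = feasible and cover_xyz >= 1 - tol                          # (3)
        spent = sum(sc["cost"][s] * y[a][s] for s in sets)
        feasible = feasible and spent <= budget + tol                             # (4)
        feasible = feasible and r[a] >= -tol and all(
            y[a][s] >= -tol and z[a][s] >= -tol for s in sets)                    # (5)
        objective += sc["prob"] * sum(
            sc["cost"][s] * (y[a][s] + z[a][s]) for s in sets)
    return feasible, objective


# Hypothetical instance: two singleton sets, one typical and one "disaster" scenario.
sets = {"S1": {1}, "S2": {2}}
scenarios = {
    "typical": {"prob": 0.9, "elements": {1}, "cost": {"S1": 2.0, "S2": 2.0}},
    "disaster": {"prob": 0.1, "elements": {1, 2}, "cost": {"S1": 10.0, "S2": 10.0}},
}
w1 = {"S1": 1.0, "S2": 5.0}
x = {"S1": 1.0, "S2": 0.0}
zero = {"S1": 0.0, "S2": 0.0}
y = {"typical": dict(zero), "disaster": dict(zero)}
z = {"typical": dict(zero), "disaster": {"S1": 0.0, "S2": 1.0}}
r = {"typical": 0.0, "disaster": 1.0}

feasible, objective = check_rasc_p(sets, scenarios, w1, x, y, z, r, budget=5.0, p=0.1)
too_strict, _ = check_rasc_p(sets, scenarios, w1, x, y, z, r, budget=5.0, p=0.05)
```

In this toy solution the disaster scenario exceeds the budget, so it is paid for through the $z$ variables with $r_A = 1$; the solution is feasible exactly when the threshold $p$ is at least the probability of that scenario.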

A significant difficulty faced in solving (RASC-P) is that the scenarios are no longer separable given a 
first-stage solution, since constraint (1) couples the different scenarios. As a consequence, in order to specify 
a solution to (RASC-P) one needs to compute a first-stage solution and give an explicit procedure that 
computes $(y_A, z_A, r_A)$ in any given scenario $A$. In our algorithms however, we can avoid this complication
because, as we show below, given only the first-stage component of a solution to (RASC-P), one can round 
it to a first-stage solution to the fractional risk-averse problem (and then to an integer solution) losing a small 
factor in the solution cost and the probability-threshold. But observe that if we have a first-stage solution x to 
the fractional risk-averse problem with probability-threshold P such that there exist second-stage solutions 
yielding a total expected cost of C, then one can also easily compute second-stage solutions that yield no 
greater total cost (and where $\Pr[\text{second-stage cost} > B] \le P$), by simply solving the LP defining $f_A(x)$ in each
scenario A. This implies that our algorithm for solving (RASC-P) only needs to return a first-stage solution 
to (RASC-P) that can be extended to a near-optimum solution (without specifying an explicit procedure to 
compute the second-stage solutions). 

We show (Theorem 3.3) that for any $\epsilon, \kappa > 0$, one can efficiently compute a first-stage solution $x$ for
which there exist solutions $(y_A, z_A, r_A)$ in every scenario $A$ satisfying (2)-(5) such that $w^{\mathrm{I}} \cdot x + \sum_A p_A w^A \cdot
(y_A + z_A) \le (1 + 2\epsilon)\,OPT$, and $\sum_A p_A r_A \le p(1 + \kappa)$. Complementing this, we give a simple rounding
procedure based on the rounding theorem in [38] to convert a fractional solution to (RASC-P) to an integer 
solution using an LP-based c-approximation algorithm for the deterministic set cover (DSC) problem, that 
is, an algorithm that returns a set cover of cost at most c times the optimum of the standard LP-relaxation 
for DSC. We prove this rounding theorem first, in order to better motivate our goal of solving (RASC-P). 
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Claim 3.1 OPT is a lower bound on the optimum of the fractional risk-averse problem. 

Proof : We show that any solution $\hat x$ to the fractional risk-averse problem can be mapped to a solution
to (RASC-P) of no greater cost. Let $\hat y_A$ be such that $f_A(\hat x) = w^A\cdot\hat y_A$, so $\Pr_A[f_A(\hat x) > B] \le \rho$. We
set $x = \hat x$. For scenario $A$, if $f_A(\hat x) \le B$, we set $r_A = 0$, $y_A = \hat y_A$, $z_A = 0$. Otherwise, we set
$r_A = 1$, $y_A = 0$, $z_A = \hat y_A$. It is easy to see that this yields a feasible solution to (RASC-P) of cost
$w^{I}\cdot\hat x + \sum_A p_A f_A(\hat x)$. ■

Theorem 3.2 (Rounding theorem) Let $(x, \{(y_A, z_A, r_A)\})$ be a solution satisfying (2)-(5) of objective
value $C = w^{I}\cdot x + \sum_A p_A\, w^A\cdot(y_A + z_A)$, and let $P = \sum_A p_A r_A$. Given any $\varepsilon > 0$, one can obtain

(i) a solution $\hat x$ such that $w^{I}\cdot\hat x + \sum_A p_A f_A(\hat x) \le (1+\frac{1}{\varepsilon})C$ and $\Pr_A[f_A(\hat x) > (1+\frac{1}{\varepsilon})B] \le (1+\varepsilon)P$;

(ii) an integer solution $(\tilde x, \{\tilde y_A\})$ of cost at most $2c(1+\frac{1}{\varepsilon})C$ such that $\Pr_A[w^A\cdot\tilde y_A > 2cB(1+\frac{1}{\varepsilon})] \le (1+\varepsilon)P$,
using an LP-based $c$-approximation algorithm for the deterministic set cover problem.

Moreover, one only needs to know the first-stage solution $x$ to obtain $\hat x$ and $\tilde x$.

Proof : Set $\hat x = (1+\frac{1}{\varepsilon})x$. Consider any scenario $A$. Observe that $(y_A + z_A)$ yields a feasible solution to
the second-stage problem for scenario $A$. Also, if $r_A < \frac{1}{1+\varepsilon}$, then $(1+\frac{1}{\varepsilon})y_A$ also yields a feasible solution.
Thus, we have $f_A(\hat x) \le w^A\cdot(y_A + z_A)$, and if $r_A < \frac{1}{1+\varepsilon}$ then we also have $f_A(\hat x) \le (1+\frac{1}{\varepsilon})B$. So
$w^{I}\cdot\hat x + \sum_A p_A f_A(\hat x) \le (1+\frac{1}{\varepsilon})C$ and $\Pr_A[f_A(\hat x) > (1+\frac{1}{\varepsilon})B] \le \sum_{A:\, r_A \ge \frac{1}{1+\varepsilon}} p_A \le (1+\varepsilon)P$.

We can now round $\hat x$ to an integer solution $(\tilde x, \{\tilde y_A\})$ using the Shmoys-Swamy [38] rounding procedure
(which only needs $\hat x$) losing a factor of $2c$ in the first- and second-stage costs. This proves part (ii). ■
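
The scaling argument in part (i) can be checked numerically on a toy instance. The sketch below is only an illustration under stated assumptions: the instance (`SETS`, `SCENARIOS`, `P`, `X`, `Y`, `R`) and all function names are hypothetical and not from the paper. It verifies that scaling by $(1 + 1/\varepsilon)$ covers every scenario with $r_A < 1/(1+\varepsilon)$, and that the remaining scenarios carry probability mass at most $(1+\varepsilon)\sum_A p_A r_A$.

```python
from typing import Dict, Set

Vector = Dict[str, float]  # maps a set name S to its fractional value


def scale(vec: Vector, eps: float) -> Vector:
    """Multiply every coordinate by (1 + 1/eps), as in the proof of part (i)."""
    factor = 1.0 + 1.0 / eps
    return {s: factor * v for s, v in vec.items()}


def coverage(sets: Dict[str, Set[str]], element: str, *vecs: Vector) -> float:
    """Total fractional extent to which `element` is covered by the given vectors."""
    return sum(v.get(s, 0.0)
               for s, members in sets.items() if element in members
               for v in vecs)


def light_scenarios_covered(sets, scenarios, x, y, r, eps) -> bool:
    """Check that (1 + 1/eps)(x + y_A) covers A whenever r_A < 1/(1 + eps)."""
    x_hat = scale(x, eps)
    for name, elements in scenarios.items():
        if r[name] < 1.0 / (1.0 + eps):
            y_hat = scale(y[name], eps)
            if any(coverage(sets, e, x_hat, y_hat) < 1.0 - 1e-9 for e in elements):
                return False
    return True


def heavy_mass(p: Dict[str, float], r: Dict[str, float], eps: float) -> float:
    """Probability mass of the scenarios with r_A >= 1/(1 + eps)."""
    return sum(p[a] for a in p if r[a] >= 1.0 / (1.0 + eps))


# Hypothetical toy instance: (x, y_A, r_A) satisfies the covering constraint (2).
SETS = {"S1": {"a", "b"}, "S2": {"b", "c"}}
SCENARIOS = {"A1": {"a", "b"}, "A2": {"c"}, "A3": {"a", "c"}}
P = {"A1": 0.6, "A2": 0.3, "A3": 0.1}
X = {"S1": 0.5, "S2": 0.4}
Y = {"A1": {"S1": 0.5}, "A2": {"S2": 0.4}, "A3": {}}
R = {"A1": 0.0, "A2": 0.2, "A3": 1.0}
```

The second check is just Markov's inequality applied to $r_A$, which is why only the first-stage vector needs to be known to carry out the rounding.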



3.1 Solving the risk-averse problem (RASC-P) 

We now describe and analyze the procedure used to solve (RASC-P). First, we get around the difficulty
posed by the coupling constraint (1) in formulation (RASC-P) by using the technique of Lagrangian relaxation.
We take the Lagrangian dual of (1), introducing a dual variable $\Delta$, to obtain the following formulation.

\[ \max_{\Delta \ge 0}\ \Bigl(-\Delta\rho + \min_{x\in\mathcal{P}} \Bigl(h(\Delta; x) := w^{I}\cdot x + \sum_A p_A\, g_A(\Delta; x)\Bigr)\Bigr) \tag{LD1} \]

where

\[ g_A(\Delta; x) := \min\ \sum_S w^A_S\,(y_{A,S} + z_{A,S}) + \Delta r_A \quad \text{s.t. (2)-(4)},\quad y_{A,S}, z_{A,S}, r_A \ge 0 \ \text{ for all } S. \tag{P} \]

It is straightforward to show via duality theory that (RASC-P) and (LD1) have the same optimal value, and
moreover that if $(x^*, \{y^*_A\}, \{z^*_A\}, \{r^*_A\})$ is an optimal solution to (RASC-P) and $\Delta^*$ is the optimal value
for the dual variable corresponding to (1), then $(\Delta^*; x^*, \{(y^*_A, z^*_A, r^*_A)\})$ is an optimal solution to (LD1).
Recall that $\mathcal{P} = [0,1]^m$. Let $\mathrm{OPT}(\Delta) = \min_{x\in\mathcal{P}} h(\Delta; x)$. So $\mathrm{OPT} = \max_{\Delta \ge 0}(\mathrm{OPT}(\Delta) - \Delta\rho)$. Let
$\lambda = \max\{1, \max_{A,S}(w^A_S / w^{I}_S)\} < \infty$, which we assume is known. The main result of this section is as follows.
Throughout, when we say "with high probability", we mean that a failure probability of $\delta$ can be ensured
using $\mathrm{poly}(\ln(\frac{1}{\delta}))$-dependence on the sample size (or running time).

Theorem 3.3 For any e, 7, k > 0, RiskAlg (see Fig. 1 ) runs in time poly(X, log(^)), and returns with 
high probability a first-stage solution $x$ and solutions $(y_A, z_A, r_A)$ for each scenario $A$ that satisfy (2)-(5)
and such that (i) $w^{I}\cdot x + \sum_A p_A\, w^A\cdot(y_A + z_A) \le (1+\epsilon)\,\mathrm{OPT} + \gamma$; and (ii) $\sum_A p_A r_A \le \rho(1+\kappa)$.
Under the very mild assumption (*) that $w^{I}\cdot x + f_A(x) \ge 1$ for every $A \ne \emptyset$, $x \in \mathcal{P}$,¹ we can convert this
guarantee into a (1 + 2e)-multiplicative guarantee in the cost in time poly(X, 

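¹ A similar assumption is made in [38] to obtain a multiplicative guarantee.

RiskAlg$(\epsilon, \gamma, \kappa)$ [$\epsilon \le \kappa < 1$; the quantities $p^{(i)}$, $\mathrm{cost}^{(i)}$, $(y_A, z_A, r_A)$ are used only in the analysis.]

C1. Fix $\varepsilon = \epsilon/6$, $\zeta = \gamma/4$, $\eta = \rho\kappa/16$. Also, set $\sigma = \epsilon/6$, $\gamma' = \gamma/4$, $\beta = \kappa/8$, and $\rho' = \rho(1 + 3\kappa/4)$.
Consider the $\Delta$ values $\Delta_0, \Delta_1, \ldots, \Delta_k$, where $\Delta_0 = \gamma'$, $\Delta_{i+1} = \Delta_i(1+\sigma)$ and $k$ is the smallest value such
that $\Delta_0(1+\sigma)^k \ge UB$. Note that $k = O(\log(UB/\gamma')/\sigma)$.

C2. For each $\Delta_i$, let $(x^{(i)}, \{(y^{(i)}_A, z^{(i)}_A, r^{(i)}_A)\}) \leftarrow$ SA-Alg$(\Delta_i; \varepsilon, \eta, \zeta)$ (here $(y^{(i)}_A, z^{(i)}_A, r^{(i)}_A)$ is an optimal solution to
$g_A(\Delta_i; x^{(i)})$ and is implicitly given). Let $p^{(i)} = \sum_A p_A r^{(i)}_A$ and
$\mathrm{cost}^{(i)} = h(\Delta_i; x^{(i)}) = w^{I}\cdot x^{(i)} + \sum_A p_A\bigl(w^A\cdot(y^{(i)}_A + z^{(i)}_A) + \Delta_i r^{(i)}_A\bigr)$.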

C3. By sampling n = 2 ^a 2 scenar i° s > f° r eacn i = 0, . . . , fc, compute an estimate p'W = $kr;^ °f P^> 

where £4 is the frequency of scenario yl in the sampled set. 

C4. If $p'^{(0)} \le \rho'$ then return $x^{(0)}$ as the first-stage solution. [In scenario $A$, return $(y_A, z_A, r_A) = (y^{(0)}_A, z^{(0)}_A, r^{(0)}_A)$.]
C5. Otherwise (i.e., $p'^{(0)} > \rho'$) find an index $i$ such that $p'^{(i)} \ge \rho'$ and $p'^{(i+1)} \le \rho'$ (we argue that such an $i$ must
exist). Let $a$ be such that $a\cdot p'^{(i)} + (1-a)\,p'^{(i+1)} = \rho'$. Return the first-stage solution $x = a\cdot x^{(i)} + (1-a)\,x^{(i+1)}$.
[In scenario $A$, return the solution $(y_A, z_A, r_A) = a\,(y^{(i)}_A, z^{(i)}_A, r^{(i)}_A) + (1-a)\,(y^{(i+1)}_A, z^{(i+1)}_A, r^{(i+1)}_A)$.]

SA-Alg$(\Delta; \varepsilon, \eta, \zeta)$ [$K$ is (a bound on) the Lipschitz constant of $h(\Delta; \cdot)$; $\mathcal{P} \subseteq B(0, R)$ and contains a ball of
radius $V \le 1$.]

Bl. Set r = C/6, N = log(^^). Let G T C 7? be an extended 7^ -net of V as defined in Section 2, so that |G T | = 

poly(log(ff )). Draw N = 87V 2 + ln( 2|G ; |m ) samples and for each scenario A, set p A = M A /N, 
where Ma is the number of times scenario A is sampled. 

B2. Solve the SAA problem $\min_{x\in\mathcal{P}} \hat h(\Delta; x)$, where $\hat h(\Delta; x) = w^{I}\cdot x + \sum_A \hat p_A\, g_A(\Delta; x)$, to obtain a solution $\hat x$.
Return $\hat x$ and, in scenario $A$, return the optimal solution to $g_A(\Delta; \hat x)$.

Figure 1: The procedures RiskAlg and SA-Alg. 
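
The search over the Lagrangian multiplier in steps C1, C4 and C5 can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, and the list `p_est` stands in for the sampled estimates $p'^{(i)}$, which in RiskAlg come from SA-Alg plus sampling.

```python
from typing import List, Tuple


def delta_grid(gamma_prime: float, sigma: float, ub: float) -> List[float]:
    """Step C1: Delta_0 = gamma', Delta_{i+1} = Delta_i * (1 + sigma),
    stopping at the first value that is at least UB."""
    grid = [gamma_prime]
    while grid[-1] < ub:
        grid.append(grid[-1] * (1.0 + sigma))
    return grid


def pick_and_interpolate(p_est: List[float], rho_prime: float) -> Tuple[int, float]:
    """Steps C4/C5: return (i, a) so that the output is
    a * x^(i) + (1 - a) * x^(i+1).

    If p_est[0] <= rho', step C4 applies and we return (0, 1.0), i.e. x^(0).
    Otherwise we look for a crossing index i with p_est[i] >= rho' >= p_est[i+1]
    and choose a with a * p_est[i] + (1 - a) * p_est[i+1] = rho'.
    """
    if p_est[0] <= rho_prime:
        return 0, 1.0
    for i in range(len(p_est) - 1):
        hi, lo = p_est[i], p_est[i + 1]
        if hi >= rho_prime >= lo:
            a = 1.0 if hi == lo else (rho_prime - lo) / (hi - lo)
            return i, a
    raise ValueError("no crossing index; the analysis expects p_est[-1] < rho'")
```

The grid has length logarithmic in $UB/\gamma'$ for fixed $\sigma$, which is why the number of calls to SA-Alg stays polynomial.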

Procedure RiskAlg is described in Figure 1. In the procedure, we also specify the second-stage solutions 
for each scenario that can be used to extend the computed first-stage solution to a near-optimal solution to 
(RASC-P). We use these solutions only in the analysis. 

We show in Section 5 that the dependence on is unavoidable in the black-box model. The "greedy 
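algorithm" for deterministic set cover [9] is an LP-based $\ln n$-approximation algorithm, so Theorem 3.3
combined with Theorem 3.2 shows that for any $\epsilon, \kappa, \varepsilon > 0$ one can efficiently compute an integer solution
$(\tilde x, \{\tilde y_A\})$ of cost at most $2\ln n\,(1+\epsilon+\frac{1}{\varepsilon})\cdot\mathrm{OPT}$ such that $\Pr_A[w^A\cdot\tilde y_A > 2B\ln n\,(1+\epsilon+\frac{1}{\varepsilon})] \le \rho(1+\kappa+\varepsilon)$.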

Algorithm RiskAlg is essentially a search procedure for the "right" value of the Lagrangian multiplier
$\Delta$, wrapped around the SAA method, which is used in procedure SA-Alg to compute a near-optimal solution
to the minimization problem $\min_{x\in\mathcal{P}} h(\Delta; x)$ for any given $\Delta \ge 0$. Theorem 3.4 states the precise
approximation guarantee satisfied by the solution returned by SA-Alg. Given this, we argue that by considering
polynomially many $\Delta$ values that increase geometrically up to some upper bound $UB$, one can efficiently
find some $\Delta$ where the solution $(x, \{(y_A, z_A, r_A)\})$ returned by SA-Alg for $\Delta$ is such that $\sum_A p_A r_A$ is
"close" to $\rho$. This will also imply that this solution is a near-optimal solution. We set $UB = 16(\sum_S w^{I}_S)/\rho$,
so $\log UB$ is polynomially bounded. However, the search for the "right" value of $\Delta$ and our analysis are
complicated by the fact that we have two sources of error whose magnitudes we need to control: first,
we only have an approximate solution $(x, \{(y_A, z_A, r_A)\})$ for $\Delta$, which also means that one cannot use
any optimality conditions; second, for any $\Delta$, we have only implicit access to the second-stage solutions
$\{(y_A, z_A, r_A)\}$ computed by Theorem 3.4, so we cannot actually compute or use $\sum_A p_A r_A$ in our search,
but will need to estimate it via sampling.
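
The sampling estimate mentioned above is just a frequency-weighted average of the $r_A$ values over the sampled scenarios. The sketch below illustrates this under assumptions: `sample` is a hypothetical black-box sampler returning a hashable scenario, and `r_of` is a hypothetical callback that computes $r_A$ for a sampled scenario (in the paper this would require solving the second-stage LP for that scenario). The sample size `n` is left as a parameter.

```python
import random
from collections import Counter
from typing import Callable, Hashable


def estimate_mass(sample: Callable[[random.Random], Hashable],
                  r_of: Callable[[Hashable], float],
                  n: int,
                  rng: random.Random) -> float:
    """Estimate sum_A p_A * r_A from n independent samples.

    The estimate is sum_A (frequency of A in the sample) * r_A, so r_of is
    evaluated once per distinct sampled scenario.
    """
    freq = Counter(sample(rng) for _ in range(n))
    return sum((count / n) * r_of(a) for a, count in freq.items())
```

Since each $r_A$ lies in $[0,1]$, a standard Chernoff-type bound controls the additive error of this estimate as `n` grows.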

Theorem 3.4 For any A > 0, and any £,C,V > 0, SA-Alg runs in time poly(X, j-, log(^)) and returns, 
with high probability, a first-stage solution $x$ such that $h(\Delta; x) \le (1+\varepsilon)\,\mathrm{OPT}(\Delta) + \eta\Delta + \zeta$.

Analysis. For the rest of this section, $\epsilon, \gamma, \kappa$ are fixed values given by Theorem 3.3, and $\varepsilon, \eta, \zeta, \beta$ are
the values set in step C1 of RiskAlg. We may assume
without loss of generality that $\epsilon \le \kappa < 1$. We prove Theorem 3.4 in Section 3.1.1. Here, we show how this
leads to the proof of Theorem 3.3. Given Theorem 3.4 and Lemma 2.3, we assume that the high probability
event "$\forall i$, $\mathrm{cost}^{(i)} \le (1+\varepsilon)\,\mathrm{OPT}(\Delta_i) + \eta\Delta_i + \zeta$ and $|p'^{(i)} - p^{(i)}| \le \beta\rho$" happens.

Claim 3.5 We have p^ < p/2 and p'^ < p/2. 

Proof : If p<*> > then cost^ - 7]A k > 2(1 + s)(J2s w s) > ( l + e)OPT(A k ) + (, which is a 

contradiction. The last inequality follows since OPT(A) < Wg for any A. Therefore, p^ < p/2, and 
p'(fc) < p (fc) +(3p< p/2. ■ 

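Proof of Theorem 3.3 : Let $x$ be the first-stage solution returned by RiskAlg, and $(y_A, z_A, r_A)$ be the
solution returned for scenario $A$. It is clear that (2)-(5) are satisfied. Suppose first that $p'^{(0)} \le \rho'$ (so
$x = x^{(0)}$). Part (ii) of the theorem follows since $p^{(0)} \le p'^{(0)} + \beta\rho \le \rho(1+\kappa)$. Part (i) follows since

\[ w^{I}\cdot x^{(0)} + \sum_A p_A\, w^A\cdot(y^{(0)}_A + z^{(0)}_A) \le h(\gamma'; x^{(0)}) \le (1+\varepsilon)\,\mathrm{OPT}(\gamma') + \eta\gamma' + \zeta \le (1+\varepsilon)\,\mathrm{OPT} + \gamma'(1+\varepsilon+\eta) + \zeta \le (1+\epsilon)\,\mathrm{OPT} + \gamma. \]

The penultimate inequality follows because for any $\Delta$, we have $\mathrm{OPT}(\Delta) \le \mathrm{OPT}(0) + \Delta \le \mathrm{OPT} + \Delta$; the
last one uses $\gamma' = \zeta = \gamma/4$ and $\varepsilon, \eta \le 1$.

Now suppose that $p'^{(0)} > \rho'$. In this case, there must exist an $i$ such that $p'^{(i)} \ge \rho'$ and $p'^{(i+1)} \le \rho'$
because $p'^{(0)} > \rho'$ and $p'^{(k)} < \rho'$ (by Claim 3.5), so step C5 is well defined. We again prove part (ii) first.
We have $\sum_A p_A r_A = a\cdot p^{(i)} + (1-a)\,p^{(i+1)} \le \rho' + \beta\rho \le \rho(1+\kappa)$. To prove part (i), observe that
$w^{I}\cdot x + \sum_A p_A\, w^A\cdot(y_A + z_A) \le a\cdot\mathrm{cost}^{(i)} + (1-a)\cdot\mathrm{cost}^{(i+1)} - \Delta_i\bigl(a\cdot p^{(i)} + (1-a)\cdot p^{(i+1)}\bigr)$, which
is at most

\[ (1+\varepsilon)\bigl(a\cdot\mathrm{OPT}(\Delta_i) + (1-a)\,\mathrm{OPT}(\Delta_{i+1})\bigr) + \eta\bigl(a\Delta_i + (1-a)\Delta_{i+1}\bigr) + \zeta - \Delta_i(\rho' - \beta\rho). \]

Now noting that $\Delta_{i+1} = (1+\sigma)\Delta_i$, it is easy to see that $\mathrm{OPT}(\Delta_{i+1}) \le (1+\sigma)\,\mathrm{OPT}(\Delta_i)$. Also,
$\rho' - \beta\rho - \eta(1+\sigma) \ge (1+\varepsilon+2\sigma)\rho$. So the above quantity is at most $(1+\varepsilon+2\sigma)\bigl(\mathrm{OPT}(\Delta_i) - \Delta_i\rho\bigr) + \zeta \le
(1+\epsilon)\,\mathrm{OPT} + \gamma$.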

The running time is the time taken to obtain the solutions for all the Aj values plus the time taken to 
compute p'W for each i. This is at most (k + 1) • poly (Z, log(-^)) + O(^ttt), using Theorem 3.4. Note 

that log(Afc) is polynomially bounded. Plugging in e, 77, £, /3, and /c, we obtain the poly(Z, log(-)) 
bound. 

Proof of multiplicative guarantee. To obtain the multiplicative guarantee, we show that by initially
sampling roughly $\max\{1/\rho, \lambda\}$ times, with high probability, one can either determine that $x = 0$ is an optimal
first-stage solution, or obtain a lower bound on OPT and then set $\gamma$ appropriately in RiskAlg to obtain
the multiplicative bound. Recall that $f_A(x)$ is the minimum value of $w^A\cdot y_A$ over all $y_A \ge 0$ such that
$\sum_{S: e\in S} y_{A,S} \ge 1 - \sum_{S: e\in S} x_S$ for all $e \in A$. Call $A = \emptyset$ a null scenario. Let $q = \sum_{A: A\ne\emptyset} p_A$ and
$\alpha = \min\{\rho, 1/\lambda\}$. Note that $\mathrm{OPT} \ge q$. Let $\hat z_A$ be an optimal solution to $f_A(0)$. Define a solution
$(\hat y_A, \hat z_A, \hat r_A)$ for scenario $A$ as follows. Set $(\hat y_A, \hat z_A, \hat r_A) = (0, 0, 0)$ if $A = \emptyset$, and $(0, \hat z_A, 1)$ if $A \ne \emptyset$.
We first argue that if $q \le \alpha$, then $(0, \{(\hat y_A, \hat z_A, \hat r_A)\})$ is an optimal solution to (RASC-P). It is clear that
the solution is feasible since $\sum_A p_A \hat r_A = q \le \rho$. To prove optimality, suppose $(x^*, \{(y^*_A, z^*_A, r^*_A)\})$ is an
optimal solution. Consider the solution where $x = 0$ and the solution for scenario $A$ is $(0, 0, 0)$ if $A = \emptyset$,
and $(0, z^*_A + y^*_A + x^*, 1)$ otherwise. This certainly gives a feasible solution. The difference between the cost
of this solution and that of the optimal solution is at most $\sum_{A: A\ne\emptyset} p_A\, w^A\cdot x^* - w^{I}\cdot x^*$, which is nonpositive
since $w^A \le \lambda w^{I}$ and $q \le 1/\lambda$. Setting $z_A = \hat z_A$ for a non-null scenario can only decrease the cost, and
hence, also yields an optimal solution.

Let $\delta$ be the desired failure probability, which we may assume to be less than | without loss of generality.

We determine with high probability if $q > \alpha$. We draw $M = \ln(\frac{1}{\delta})/\alpha$ samples and compute $X$ = number of
times a non-null scenario is sampled. We claim that with high probability, if $X > 0$ then $\mathrm{OPT} \ge LB = \frac{\delta}{\ln(1/\delta)}\cdot\alpha$;
in this case, we return the solution RiskAlg$(\epsilon, \epsilon LB, \kappa)$ to obtain the desired guarantee. Otherwise,
if $X = 0$, we return $(0, \{(\hat y_A, \hat z_A, \hat r_A)\})$ as the solution.

Let $r = \Pr[X = 0] = (1-q)^M$. So $1 - qM \le r \le e^{-qM}$. If $q \ge \ln(\frac{1}{\delta})/M$, then $\Pr[X = 0] \le \delta$, so
with probability at least $1-\delta$ we say that $\mathrm{OPT} \ge LB$, which is true since $\mathrm{OPT} \ge q \ge \alpha$. If $q \le \delta/M$,
then $\Pr[X = 0] \ge 1-\delta$ and we return $(0, \{(\hat y_A, \hat z_A, \hat r_A)\})$ as the solution, which is an optimal solution
since $q \le \alpha$. If $\delta/M < q < \ln(\frac{1}{\delta})/M$, then we always return a correct answer since it is both true that
$\mathrm{OPT} \ge q \ge LB$, and that $(0, \{(\hat y_A, \hat z_A, \hat r_A)\})$ is an optimal solution. ■
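
The preliminary sampling test can be sketched as follows. This is only an illustration: the function names and the sampler interface are hypothetical, and the sample size $M = \lceil \ln(1/\delta)/\alpha \rceil$ with lower bound $\delta/M$ is an assumption chosen to be consistent with the case analysis above, not a value confirmed by the source.

```python
import math
import random
from typing import Callable, FrozenSet, Tuple


def null_scenario_test(sample: Callable[[random.Random], FrozenSet[str]],
                       alpha: float,
                       delta: float,
                       rng: random.Random) -> Tuple[int, int, float]:
    """Draw M samples and count how many are non-null (non-empty) scenarios.

    Returns (hits, M, lower_bound). If hits > 0 the caller would run RiskAlg
    with gamma proportional to lower_bound; if hits == 0 it would return the
    all-zero first-stage solution.
    """
    m = math.ceil(math.log(1.0 / delta) / alpha)  # assumed sample size
    hits = sum(1 for _ in range(m) if len(sample(rng)) > 0)
    lower_bound = delta / m                       # assumed value of LB
    return hits, m, lower_bound


def prob_no_hit(q: float, m: int) -> float:
    """Pr[X = 0] = (1 - q)^M when each sample is non-null with probability q."""
    return (1.0 - q) ** m
```

The sandwich $1 - qM \le (1-q)^M \le e^{-qM}$ is what separates the three regimes of $q$ used in the proof.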



3.1.1 Proof of Theorem 3.4 

Throughout this section, $\varepsilon, \eta, \zeta$ are fixed at the values given in the statement of Theorem 3.4. Let (BSC-P)
denote the problem $\min_{x\in\mathcal{P}} h(\Delta; x)$. The proof proceeds by analyzing the subgradients of $h(\Delta; \cdot)$ and
$\hat h(\Delta; \cdot)$ and showing that Lemma 2.2 can be applied here.

We first note that the arguments given in [38, 45, 7] for 2-stage programs do not directly apply to
(BSC-P) since it does not fall into the class of problems considered therein. Shmoys and Swamy [38] show
(essentially) that if one can compute an $(\omega, \xi)$-subgradient of the objective function $h(\Delta; \cdot)$ at any given
point $x$ for sufficiently small $\omega, \xi$, then one can use the ellipsoid method to obtain a near-optimal solution to
(BSC-P). They argue that for a large class of 2-stage LPs, one can efficiently compute an $(\omega, \xi)$-subgradient
using polynomially many samples. Subsequently [45], they leveraged the proof of the ellipsoid-based algorithm to
argue that the SAA method also yields an efficient approximation scheme for the same class of 2-stage LPs.
These proofs rely on the fact that for their class of 2-stage programs, each component of the subgradient
lies in a range bounded multiplicatively by a factor of $\lambda$ and can be approximated additively using $\mathrm{poly}(\lambda)$
samples. However, in the case of (BSC-P), for a subgradient $d = (d_S)$ of $h(\Delta; \cdot)$, we can only say that
$d_S \in [-\lambda w^{I}_S - \Delta,\ w^{I}_S]$ (see Lemma 3.6), which makes it difficult to obtain an $(\omega, \xi)$-subgradient using sampling
for suitably small $\omega, \xi$. Charikar, Chekuri and Pál [7] considered a similar class of 2-stage problems, and
gave an alternate proof of efficiency of the SAA method showing that even approximate solutions to the
SAA problem translate to approximate solutions to the original problem. Their proof shows that if $\lambda$ is such
that <7yi(A; x) — #a(A; 0) < Aw 1 ■ x for every A and x G V, then poly(Z, -) samples suffice to construct 
an SAA problem whose optimal solutions correspond to (1 + e)-optimal solutions to the original problem. 
But for our problem, we can only obtain the bound A < w A ■ x + A(^ 5 x$) < \w l ■ x + A Y^s x s> an d 
A might be large compared to w l ■ x. 

The key insight that allows us to circumvent these difficulties is that in order to establish our (weak)
guarantee, where we allow for an additive error measured relative to $\Delta$, it suffices to be able to approximate
each component $d_S$ of the subgradient of $h(\Delta; \cdot)$ within an additive error proportional to $(w^{I}_S + \Delta)$, and
this can be done by drawing $\mathrm{poly}(\lambda)$ samples. This enables one to argue that functions $h(\Delta; \cdot)$ and $\hat h(\Delta; \cdot)$
satisfy the "closeness-in-subgradients" property stated in Lemma 2.2.

The subgradients of $h(\Delta; \cdot)$ and $\hat h(\Delta; \cdot)$ at $x$ are obtained from the optimal dual solutions to $g_A(\Delta; x)$
for every $A$. The dual of $g_A(\Delta; x)$ is given by

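\[
\begin{aligned}
\max\quad & \sum_{e\in A} (\alpha_{A,e} + \beta_{A,e})\Bigl(1 - \sum_{S: e\in S} x_S\Bigr) - B\,\theta_A && \text{(D)} \\
\text{s.t.}\quad & \sum_{e\in S\cap A} (\alpha_{A,e} + \beta_{A,e}) \le w^A_S\,(1 + \theta_A) && \text{for all } S \\
& \sum_{e\in S\cap A} \beta_{A,e} \le w^A_S && \text{for all } S \\
& \sum_{e\in A} \alpha_{A,e} \le \Delta \\
& \alpha_{A,e},\ \beta_{A,e} \ge 0 \ \text{ for all } e \in A, \qquad \theta_A \ge 0.
\end{aligned}
\]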

Here $\alpha_{A,e}$ and $\beta_{A,e}$ are respectively the dual variables corresponding to (2) and (3), and $\theta_A$ is the dual
variable corresponding to (4). As in [38], we then have the following description of the subgradient of $h$.

Lemma 3.6 Let $(\alpha^*_A, \beta^*_A, \theta^*_A)$ be an optimal dual solution to $g_A(\Delta; x)$. Then the vector $d_x$ with components
$d_{x,S} = w^{I}_S - \sum_A p_A \sum_{e\in S\cap A} (\alpha^*_{A,e} + \beta^*_{A,e})$ is a subgradient of $h(\Delta; \cdot)$ at $x$.

Since $\hat h(\Delta; \cdot)$ is of the same form as $h(\Delta; \cdot)$, we have similarly that $\hat d_x = (\hat d_{x,S})$, where
$\hat d_{x,S} = w^{I}_S - \sum_A \hat p_A \sum_{e\in S\cap A} (\alpha^*_{A,e} + \beta^*_{A,e})$, is a subgradient of $\hat h(\Delta; \cdot)$ at $x$. Since $d_x$ and $\hat d_x$ both have $\ell_2$ norm at most
$\lambda\|w^{I}\| + |\Delta|$, $h(\Delta; \cdot)$ and $\hat h(\Delta; \cdot)$ have Lipschitz constant at most $K = \lambda\|w^{I}\| + |\Delta|$.

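Lemma 3.7 Let $d$ be a subgradient of $h(\Delta; \cdot)$ at the point $x \in \mathcal{P}$, and suppose that $\hat d$ is a vector such that
$\hat d_S \in [d_S - \omega w^{I}_S - \xi/2m,\ d_S + \omega w^{I}_S + \xi/2m]$ for all $S$. Then $\hat d$ is an $(\omega, \xi)$-subgradient of $h(\Delta; \cdot)$ at $x$.

Proof : Let $y$ be any point in $\mathcal{P}$. We have $h(\Delta; y) - h(\Delta; x) \ge \hat d\cdot(y - x) + (d - \hat d)\cdot(y - x)$. The second
term is at least

\[ \sum_{S:\, d_S \le \hat d_S} (d_S - \hat d_S)\,y_S + \sum_{S:\, d_S > \hat d_S} (\hat d_S - d_S)\,x_S \ \ge\ \sum_S \bigl(-\omega w^{I}_S y_S - \omega w^{I}_S x_S\bigr) - \xi \ \ge\ -\omega h(\Delta; y) - \omega h(\Delta; x) - \xi. \]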

In the sequel, we set uj = e/8N, £ = rjA/2N. Let (a* A ,/3 A ,9 A ) be the optimal dual solution to 
<?a(A; x) used to define d x and d x . Notice that d Xt s is simply w l s — ^2 e£ s{ a *A e + ftJ averaged over the 
scenarios sampled independently to construct the SAA problem h(A; .), and E [d X} s] — d x> s- The sample 
size M in SA-Alg is specifically chosen so that the Chernoff bound (Lemma 2.3) implies that \d Xi s — d Xt s\ < 
ujw 1 s + ^ / 2m for all S with probability at least 1 — -J^ for every x € G T ; hence, d x is an (to, £)-subgradient 

of h(A; .) at x (by Lemma 3.7). So taking the union bound shows that with probability at least 1 — 5, h(A; .) 
and h(A; .) satisfy the conditions of Lemma 2.2 with K = AH^x; 1 1| + |A|, q = s and £ (as above), which 
yields the desired approximation guarantee. 

We can take R = ^/m and V = \ here, so the number of samples M is poly (X, A ; J g( A)) _ ■ 

Remark 3.8 Notice that nowhere do we use the fact that the scenario-budgets are uniform, and thus, our
results (Theorem 3.4 and hence, Theorem 3.3) extend to the setting where we have different budgets for the
different scenarios. The scenario budgets $\{B_A\}$ are now not specified explicitly; we get to know $B_A$ when
we sample scenario $A$. (Notice that we may assume that $B_A \le \lambda\sum_S w^{I}_S$ for all $A$.)

3.2 Risk-averse robust set cover

In the risk-averse robust set cover problem, the goal is to choose some sets $x$ in stage I and some sets $y_A$ in
each scenario $A$ so that their union covers $A$, so as to minimize $w^{I}\cdot x + Q_\rho[w^A\cdot y_A]$. Recall that $Q_\rho[w^A\cdot y_A]$
is the $(1-\rho)$-quantile of $\{w^A\cdot y_A\}_{A\in\mathcal{A}}$, that is, the smallest $B$ such that $\Pr_A[w^A\cdot y_A > B] \le \rho$. As
mentioned in the Introduction, risk-averse robust problems can be essentially reduced to risk-averse budget
problems. We briefly sketch this reduction here for the set cover problem. The same ideas can be used
to obtain approximation algorithms for the risk-averse robust versions of all the applications considered in
Section 4.
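
For a finite, explicitly listed distribution, the $(1-\rho)$-quantile defined above can be computed directly. The following is a minimal sketch with a hypothetical function name; in the black-box setting of the paper the distribution is not available explicitly, so this only illustrates the definition.

```python
from typing import Sequence


def quantile_cost(costs: Sequence[float], probs: Sequence[float], rho: float) -> float:
    """Smallest realized cost B such that Pr[cost > B] <= rho.

    `costs[i]` is the second-stage cost in scenario i and `probs[i]` its
    probability. For rho < 1 the smallest such B is always one of the
    realized costs, so it suffices to scan them in increasing order.
    """
    for b in sorted(set(costs)):
        tail = sum(p for c, p in zip(costs, probs) if c > b)
        if tail <= rho + 1e-12:  # small tolerance for floating-point sums
            return b
    return max(costs)  # not reached: the tail above the maximum cost is 0
```

For example, with costs 1, 5, 10 of probability 0.5, 0.3, 0.2, a threshold of 0.2 allows the most expensive scenario to be ignored, so the quantile is 5.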

We use the common method of "guessing" $B = Q_\rho[w^A\cdot y_A]$ for an optimal solution. Given this guess,
we need to find integral $(x, \{y_A\})$ so as to minimize $w^{I}\cdot x + B$ (and hence, $w^{I}\cdot x$) subject to the constraint
that $x + y_A$ forms a set cover for $A$ and $\Pr_A[w^A\cdot y_A > B] \le \rho$. This looks very similar to the risk-averse
budgeted set cover problem; the only difference is that the expected second-stage cost does not appear in
the objective function. Thus, one can write an LP-relaxation for the (fractional) risk-averse robust problem
that looks similar to (RASC-P) except that the objective function is now $w^{I}\cdot x$, and constraint (3) and the
variables $z_{A,S}$ can be dropped. After Lagrangifying (1) using the dual variable $\Delta$, we obtain the following
problem

\[ \max_{\Delta \ge 0}\ \Bigl(-\Delta\rho + \min_{x\in\mathcal{P}} \Bigl(h'(\Delta; x) := w^{I}\cdot x + \sum_A p_A\, g'_A(\Delta; x)\Bigr)\Bigr) \tag{LD2} \]

where $g'_A(\Delta; x) := \min\{\Delta r_A : (2), (4),\ y_A \ge 0,\ r_A \ge 0\}$.

Let $\mathrm{OPT}_{Rob}$ denote the optimum value of the fractional risk-averse robust problem $\min_{x\in\mathcal{P}} (w^{I}\cdot x +
Q_\rho[f_A(x)])$, and $\mathrm{OPT}_{Rob}(B)$ denote the optimum value of (LD2) for a given $B \ge 0$. Note that $\mathrm{OPT}_{Rob}(B)$
decreases with $B$. We prove that for any $B \ge 0$ and $\Delta \ge 0$, SA-Alg returns a solution to the inner
minimization problem in (LD2) that satisfies the approximation guarantee stated in Theorem 3.4. Arguing
as in the proof of Theorem 3.3, this implies that RiskAlg can be used to obtain a near-optimal solution to
(LD2) while violating the probability threshold by a small factor.

The claimed approximation guarantee for SA-Alg follows because $h'(\Delta; \cdot)$ and its sample-average
approximation $\hat h'(\Delta; \cdot)$ constructed in SA-Alg satisfy the closeness-in-subgradients property of Lemma 2.2.
Let $\alpha^*_{A,e}$ be the value of the dual variable corresponding to (2) in an optimal dual solution to $g'_A(\Delta; x)$.
Note that $\sum_e \alpha^*_{A,e} \le \Delta$ for all $A$. Similar to Lemma 3.6, we now have that the vectors $d_x = (d_{x,S})$ with
$d_{x,S} = w^{I}_S - \sum_A p_A\bigl(\sum_{e\in S\cap A} \alpha^*_{A,e}\bigr)$ and $\hat d_x = (\hat d_{x,S})$ with $\hat d_{x,S} = w^{I}_S - \sum_A \hat p_A\bigl(\sum_{e\in S\cap A} \alpha^*_{A,e}\bigr)$ are
respectively subgradients of $h'(\Delta; \cdot)$ and $\hat h'(\Delta; \cdot)$ at $x$. Let $N, \mathcal{N}, \tau, G_\tau$ be as defined in SA-Alg with $R = \sqrt{m}$,
V = \ and K = \\w l \\ + \A\. Using N samples, for any x E G T , with very high probability we have 
that \d Xj s — d Xj s\ < rjA/AmN; thus, as in Lemma 3.7, d x is an (0, 7^r)-subgradient of h'{A; .) at x. So 
Lemma 2.2 shows that SA-Alg returns a solution x such that /i'(A; x) < OPT + rjA + £ with high prob- 
ability. Notice that in fact, the approximation guarantee obtained via SA-Alg is purely additive. Also, one 
can avoid the dependence of the sample-size on A (and e) here since the modified form of the subgradient 
means that we can ensure that \d Xj s — d Xj s\ < r/A/Amn for every x € G T and component S using a number 
of samples that is independent of A. This implies that for any e, 7, k > 0, RiskAlg computes (nonnegative) 
{x,{y A ,r A }) satisfying (2), (4) such that w l ■ x < (1 + e)OPT Rob (B) + 7 and Y^aVata < p(l + n). 

To complete the reduction, we describe how to guess $B$. Let $W = \sum_S w^{I}_S$, which is an upper bound on
the optimum (with $\log W$ polynomially bounded). We use the standard method of enumerating values of $B$
increasing geometrically by $(1+\epsilon)$; we start at $\gamma$ and end at the smallest value that is at least $W$. So if $B^*$
is the "correct" guess, then we are guaranteed to enumerate $B' \in [B^*, (1+\epsilon)B^* + \gamma]$. We use RiskAlg to
compute the solution for each $B$, and return $(x, \{y_A, r_A\})$ that minimizes $w^{I}\cdot x + B$. Let $(x', \{y'_A, r'_A\})$
be the solution computed for $B'$. Then we have $w^{I}\cdot x + B \le w^{I}\cdot x' + B' \le (1+\epsilon)\,\mathrm{OPT}_{Rob}(B') +
(1+\epsilon)B^* + 2\gamma \le (1+\epsilon)\,\mathrm{OPT}_{Rob} + 2\gamma$. We remark that the same techniques yield a similar guarantee
for the LP-relaxation of a generalization of the problem, where we wish to minimize $w^{I}\cdot x$ plus a weighted
combination of $\mathrm{E}_A[w^A\cdot y_A]$ and $Q_\rho[w^A\cdot y_A]$.
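
The geometric enumeration of budget guesses can be sketched as follows. The function names are hypothetical; `gamma`, `eps` and `w_upper` play the roles of the additive error, the growth rate and the upper bound on the optimum described above.

```python
from typing import List


def budget_guesses(gamma: float, eps: float, w_upper: float) -> List[float]:
    """Guesses gamma, gamma(1+eps), gamma(1+eps)^2, ... ending at the first
    value that is at least w_upper."""
    guesses = [gamma]
    while guesses[-1] < w_upper:
        guesses.append(guesses[-1] * (1.0 + eps))
    return guesses


def first_guess_at_least(guesses: List[float], b_star: float) -> float:
    """Smallest enumerated guess B' with B' >= b_star (requires b_star <= guesses[-1])."""
    return next(b for b in guesses if b >= b_star)
```

If the true value is at most `gamma`, the first guess already lies in the required interval; otherwise the guess just below the true value, multiplied by `1 + eps`, gives a guess that overshoots by at most that factor.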

We can convert the above guarantee into a purely multiplicative one under the same assumption (*)
stated in Theorem 3.3. Let $q = \sum_{A\ne\emptyset} p_A$. Notice that if $q \le \rho$, then $\mathrm{OPT}_{Rob} = 0$ and $x = 0$ is an optimal

solution, and otherwise OPTn b > 1. Let 5 be such that (1 + n) wfrg) < L Using ln ^/^ samples we can 
determine with high probability if $q \le \rho'$ or if $q > \rho$. In the former case, we return $x = 0$ and $y_A$ in scenario
$A$, where $y_A = 0$ if $A = \emptyset$ and is any feasible solution if $A \ne \emptyset$. Note that $w^{I}\cdot x + Q_{\rho'}[w^A\cdot y_A] = 0$. In the
latter case, we set $\gamma = \epsilon$, and execute the procedure detailed above to obtain a $(1+3\epsilon)$-multiplicative
guarantee.

Finally, one can use Theorem 3.2 to round the fractional solution to an integer solution, or to a
solution to the fractional risk-averse robust problem. (The violation of the budget $B$ can now be absorbed
into the approximation ratio.) For any $\epsilon, \kappa, \varepsilon > 0$, we obtain a fractional solution $\hat x$ such that
$w^{I}\cdot\hat x + Q_{\rho(1+\kappa+\varepsilon)}[f_A(\hat x)] \le (1+\epsilon+\frac{1}{\varepsilon})\,\mathrm{OPT}_{Rob}$, and an integer solution $(\tilde x, \{\tilde y_A\})$ such that
$w^{I}\cdot\tilde x + Q_{\rho(1+\kappa+\varepsilon)}[w^A\cdot\tilde y_A] \le 2c(1+\epsilon+\frac{1}{\varepsilon})\,\mathrm{OPT}_{Rob}$ using an LP-based $c$-approximation algorithm for deterministic set cover.

Setting $B = 0$ above yields a problem that is interesting in its own right. When $B = 0$, we seek a
minimum-cost collection of sets $x$ that are picked only in stage I such that $\Pr_A[x \text{ is not a set cover for } A] \le
\rho$. That is, we obtain a chance-constrained problem without recourse. As shown above (although $B = 0$
is not one of our "guesses"), we can solve this chance-constrained set cover problem to obtain a solution $x$
such that $w^{I}\cdot x \le (1+\epsilon)\,\mathrm{OPT}_{Rob}(0) + \gamma$ where $\Pr_A[x \text{ does not cover } A] \le \rho(1+\kappa)$.

4 Applications to combinatorial optimization problems 

We now show that the techniques developed in Section 3 for the risk-averse budgeted set cover problem can 
be used to obtain approximation algorithms for the risk-averse versions of various combinatorial optimiza- 
tion problems such as covering problems — (set cover,) vertex cover, multicut on trees, min s-t cut — and 
facility location problems. This includes many of the problems considered in [17, 38, 11] in the standard 
2-stage and demand-robust models. 

In all the applications, the first step is to argue that procedure RiskAlg can be used to obtain a near- 
optimal solution to a suitable LP-relaxation of the problem while violating the probability threshold by a 
small factor. Theorem 3.3 proves this for covering problems; for multicommodity flow and facility location, 
we need to modify the arguments slightly. The second step, which is more problem-specific, is to round the 
LP-solution to an integer solution. Analogous to part (i) of Theorem 3.2, we first round the LP-solution to a 
solution to the fractional risk-averse problem. Given this, our task is now reduced to rounding a fractional 
solution to a standard 2-stage problem into an integral one. For this latter step, one can use any "local" 
LP-based approximation algorithm for the 2-stage problem, where a local algorithm is one that preserves 
approximately the cost of each scenario. (For set cover, vertex cover and multicut on trees, we may use 
part (ii) of Theorem 3.2 directly, which utilizes the local LP-rounding algorithm in [38] (which in turn is 
obtained using an LP-based approximation algorithm for the deterministic covering problem).) As in the 
case of risk-averse robust set cover, our results extend to the setting of non-uniform budgets. 

We say that an algorithm is a $(c_1, c_2, c_3)$-approximation algorithm for the risk-averse problem with
budget $B$ and threshold $\rho$, if it returns a solution of cost at most $c_1$ times the optimum where the probability
that the second-stage cost exceeds $c_2\cdot B$ is at most $c_3\cdot\rho$.

Our approximation results for the budgeted problem also translate to the risk-averse robust version of
the problem. Specifically, a $(c_1, c_2, c_3)$-approximation algorithm for the budgeted problem implies that one
can obtain an integer solution $(x, \{y_A\})$ to the robust problem such that $c(x) + Q_{\rho(1+c_3)}[f_A(x, y_A)] \le
\max\{c_1, c_2\}\cdot\mathrm{OPT}_{Rob}$. As mentioned in Section 3.2, the robust problem with a guess of $Q_\rho[f_A(x, y_A)] = 0$
gives rise to a problem where one can take actions only in stage I and one seeks to "take care" of "most"
second-stage scenarios; we can solve this chance-constrained problem approximately. We also achieve
bicriteria approximation guarantees for the problem of minimizing $c(x)$ plus a weighted combination of
$\mathrm{E}_A[f_A(x, y_A)]$ and $Q_\rho[f_A(x, y_A)]$.

4.1 Covering problems 

Vertex cover and multicut on trees. In the risk-averse budgeted vertex cover problem, we are given a
graph whose edges need to be covered by vertices. The edge-set is random and determined by a distribution
(on sets of edges). A vertex $v$ may be picked in stage I or in a scenario $A$ incurring a cost of $w^{I}_v$ or $w^A_v$
respectively. We are also given a budget $B$ and a probability threshold $\rho$ and require that the probability that
the second-stage cost of picking vertices exceeds $B$ be at most $\rho$. In the risk-averse version of multicut on
trees, we are given a tree, a (black-box) distribution over sets of $s_i$-$t_i$ pairs, a budget $B$, and a threshold $\rho$.
The goal is to choose edges in stage I and in each scenario such that the union of edges picked in stage I and
in scenario $A$ forms a multicut for the $s_i$-$t_i$ pairs that are revealed in scenario $A$. Moreover, the second-stage
cost of picking edges may exceed $B$ with probability at most $\rho$. The goal is to minimize the total expected
cost.

Both these problems are structured cases of risk-averse budgeted set cover. So one can formulate an
LP-relaxation of the risk-averse problem exactly as in (RASC-P) and by Theorem 3.3, obtain a near-optimal
solution to the relaxation. We may then apply Theorem 3.2 directly to these problems to round the
fractional solution. Since there is an LP-based 2-approximation algorithm for the deterministic versions of both
problems, we obtain the following theorem.

Theorem 4.1 For any $\epsilon, \kappa, \varepsilon > 0$, there is a $(4(1+\epsilon+\frac{1}{\varepsilon}),\ 4(1+\epsilon+\frac{1}{\varepsilon}),\ 1+\kappa+\varepsilon)$-approximation algorithm
for the risk-averse budgeted versions of vertex cover and multicut on trees.

Min s-t cut. In the stochastic min $s$-$t$ cut problem, we are given an undirected graph $G = (V, E)$ and a
source $s \in V$. The location of the sink $t$ is random and given by a distribution. We may pick an edge $e$ in
stage I or in a scenario $A$ incurring costs $w^{I}_e$ and $w^A_e$ respectively. The constraints are that in any scenario $A$
with sink $t_A$, the edges picked in stage I and in that scenario induce an $s$-$t_A$ cut, and the goal is to minimize
the expected cost of choosing edges. In the risk-averse budgeted problem there is the additional constraint
that the second-stage cost may exceed a given budget $B$ with probability at most (a given value) $\rho$.

The LP-relaxation of the risk-averse problem based on a path-covering formulation is a special case
of (RASC-P). The only additional observation needed to see that Theorem 3.3 can be applied here is that
the covering problem (P) for a scenario $A$ (and its dual) can be solved efficiently although there are an
exponential number of constraints. Thus, procedures RiskAlg and SA-Alg can be implemented efficiently
and we may obtain a near-optimal solution to the relaxation.

We use Theorem 3.2, part (i) to convert the solution to a near-optimal solution $\hat x$ to the fractional
risk-averse problem. We now use the algorithm in [11], which is a local LP-based $O(\log|V|)$-approximation
algorithm, to round this solution to an integral one. Their algorithm requires that there exist multipliers $\lambda^A$
in each scenario $A$ such that $w^A_e = \lambda^A w^{I}_e$ for every $e$; consequently we also need this for our result. A
detail worth noting is that their algorithm requires access also to the second-stage fractional solutions (but
not the scenario-probabilities). But this is not a problem since there are only polynomially many scenarios
here corresponding to the different locations of the sink. So given the first-stage solution $\hat x$, one can simply
compute the optimal fractional second-stage solution for each scenario for use in their algorithm.

Theorem 4.2 For any $\epsilon, \kappa, \varepsilon > 0$, there is an $(O(\log|V|)(1+\epsilon+\frac{1}{\varepsilon}),\ O(\log|V|)(1+\epsilon+\frac{1}{\varepsilon}),\ 1+\kappa+\varepsilon)$-
approximation algorithm for risk-averse budgeted min $s$-$t$ cut.

4.2 Facility location problems

In the risk-averse budgeted facility location problem (RAUFL), we have a set of $m$ facilities $\mathcal{F}$, a client-set
$\mathcal{D}$, and a distribution over client-demands. We may open facilities in stage I or in a given scenario, and in
each scenario $A$, for every client $j$ with non-zero demand $d^A_j$, we must assign its demand to a facility opened
in stage I or in that scenario. The costs of opening a facility $i \in \mathcal{F}$ in stage I and in a scenario $A$ are $f^{I}_i$ and
$f^A_i$ respectively; the cost of assigning a client $j$'s demand in scenario $A$ to a facility $i$ is $d^A_j c_{ij}$, where the
$c_{ij}$'s form a metric. The first-stage cost is the cost of opening facilities in stage I, and the cost of scenario
$A$ is the sum of all the facility-opening and client-assignment costs incurred in that scenario. The goal is to
minimize the total expected cost subject to the usual condition that the probability that the second-stage cost
exceeds $B$ is at most some threshold $\rho$. For notational simplicity, we consider the case of $\{0,1\}$-demands,
so a scenario $A \subseteq \mathcal{D}$ simply specifies the clients that need to be assigned in that scenario. We formulate the
following LP-relaxation of the problem. Throughout, $i$ indexes the facilities in $\mathcal{F}$ and $j$ the clients in $\mathcal{D}$.

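$$
\begin{aligned}
\min \quad & \sum_i f_i^{\mathrm{I}} y_i + \sum_{A \subseteq \mathcal{D}} p_A \Bigl( \sum_i f_i^A (y_{A,i} + v_{A,i}) + \sum_{j \in A,\, i} c_{ij} (x_{A,ij} + u_{A,ij}) \Bigr) && \text{(RAFL-P)} \\
\text{s.t.} \quad & \sum_A p_A r_A \le \rho && (6) \\
& \sum_i x_{A,ij} + r_A \ge 1 \quad \text{for all } j \in A && (7) \\
& \sum_i (x_{A,ij} + u_{A,ij}) \ge 1 \quad \text{for all } j \in A && (8) \\
& x_{A,ij} \le y_i + y_{A,i} \quad \text{for all } j \in A,\ i && (9) \\
& x_{A,ij} + u_{A,ij} \le y_i + y_{A,i} + v_{A,i} \quad \text{for all } j \in A,\ i && (10) \\
& \sum_i f_i^A y_{A,i} + \sum_{j \in A,\, i} c_{ij} x_{A,ij} \le B \quad \text{for all } A && (11) \\
& y_i,\ y_{A,i},\ v_{A,i},\ x_{A,ij},\ u_{A,ij},\ r_A \ge 0 \quad \text{for all } A, i, j. && (12)
\end{aligned}
$$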

Here $y_i$ denotes the first-stage decisions. The variable $r_A$ denotes whether one exceeds the budget $B$ in scenario $A$;
(6) limits the probability mass of such scenarios to at most $\rho$. The decisions $(x_{A,ij}, y_{A,i})$ and $(u_{A,ij}, v_{A,i})$
are intended to denote the decisions taken in scenario $A$ in the two cases when one does not exceed the budget,
and when one does exceed the budget, respectively. Correspondingly, (7) and (8) enforce that every client is
assigned to a facility in these two cases, and (9) and (10) ensure that a client is only assigned to a facility
opened in stage I or in that scenario in these two cases. Finally, (11) is the budget constraint for a scenario.
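
To make the integer problem behind (RAFL-P) concrete, the following brute-force sketch enumerates stage-I facility sets on a tiny explicit instance. It is illustrative only and rests on assumptions not made in the text: the scenarios are given as an explicit list (not a black box), every scenario shares one second-stage opening-cost vector `f2`, and the helper names `subsets`, `scenario_cost` and `solve_rafl` are hypothetical. It runs in exponential time and is not the LP-based method developed in this section.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a tuple."""
    items = list(items)
    for r in range(len(items) + 1):
        yield from combinations(items, r)


def scenario_cost(opened_stage1, scenario, f2, c):
    """Cheapest second-stage cost of a scenario given the stage-I facilities.

    c[i][j] is the cost of assigning client j to facility i (0/1 demands).
    """
    best = float("inf")
    for extra in subsets(range(len(f2))):
        open_now = set(opened_stage1) | set(extra)
        if scenario and not open_now:
            continue  # clients must be served but no facility is open
        opening = sum(f2[i] for i in extra)
        assignment = sum(min(c[i][j] for i in open_now) for j in scenario)
        best = min(best, opening + assignment)
    return best


def solve_rafl(f1, f2, c, scenarios, B, rho):
    """Brute-force risk-averse budgeted facility location.

    scenarios: list of (probability, tuple_of_clients).
    Returns (expected total cost, stage-I facility set), or (inf, None)
    when no stage-I set meets the probabilistic budget constraint.
    """
    best = (float("inf"), None)
    for stage1 in subsets(range(len(f1))):
        costs = [(p, scenario_cost(stage1, A, f2, c)) for p, A in scenarios]
        # Probability mass of scenarios whose second-stage cost exceeds B.
        violation = sum(p for p, cost in costs if cost > B)
        if violation > rho + 1e-12:
            continue
        total = sum(f1[i] for i in stage1) + sum(p * cost for p, cost in costs)
        if total < best[0]:
            best = (total, stage1)
    return best
```

On an instance with two facilities co-located with two clients, lowering `rho` from 0.1 to 0 forces both facilities to be opened in stage I, which illustrates how the threshold trades expected cost against budget violations.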

Let OPT be the optimal value of (RAFL-P). Given first-stage decisions $y \in [0, 1]^m$, let $\ell_A(y)$ denote
the minimum cost of fractionally opening facilities and fractionally assigning clients in scenario $A$ to open
facilities (i.e., facilities opened to a combined extent of 1 in stage I and scenario $A$). Let $\mathcal{P} = [0, 1]^m$. As in
Section 3, we Lagrangify (6) using a dual variable $\Delta \ge 0$ to obtain the problem $\max_{\Delta \ge 0} \bigl(-\Delta\rho + \mathrm{OPT}(\Delta)\bigr)$,
where $\mathrm{OPT}(\Delta) = \min_{y \in \mathcal{P}} h(\Delta; y)$, $h(\Delta; y) = f^{\mathrm{I}} \cdot y + \sum_A p_A g_A(\Delta; y)$, and $g_A(\Delta; y)$ is the minimum
value of $\sum_i f_i^A (y_{A,i} + v_{A,i}) + \sum_{j \in A,\, i} c_{ij} (x_{A,ij} + u_{A,ij}) + \Delta r_A$ subject to (7)-(12) (where the $y_i$'s are
fixed now). As in Claim 3.1, it is easy to show that OPT is a lower bound on the optimal value of even the
fractional risk-averse problem.

Theorem 4.3 For any $\epsilon, \gamma, \kappa > 0$, in time $\mathrm{poly}(\mathcal{I}, \log(\ldots))$, one can use RiskAlg to compute (with high
probability) $(y, \{(x_A, y_A, u_A, v_A, r_A)\})$ that satisfies (7)-(12) with objective value $C \le (1 + \epsilon)\,\mathrm{OPT} + \gamma$
such that $\sum_A p_A r_A \le \rho(1 + \kappa)$. This can be converted to a $(1 + 2\epsilon)$-guarantee in the cost provided
$f^{\mathrm{I}} \cdot y + \ell_A(y) \ge 1$ for every $y \in [0, 1]^m$, $A \ne \emptyset$.

Proof : Examining procedure RiskAlg, arguing that RiskAlg can be used to approximately solve (RAFL-P)
involves two things: (a) coming up with a bound UB such that log UB is polynomially bounded, so that one
can restrict the search for the right value of $\Delta$ in RiskAlg; and (b) showing that an optimal solution to the
SAA-version of the inner-minimization problem for any $\Delta \ge 0$ constructed in SA-Alg yields a solution to
the true minimization problem that satisfies the approximation guarantee in Theorem 3.4.

There are two notable aspects in which risk-averse facility location differs from risk-averse set cover.
First, unlike in set cover, one cannot ensure that the cost incurred in a scenario is always 0 by choosing the
first-stage decisions appropriately. Thus, the problem (RAFL-P) may in fact be infeasible. This creates
some complications in coming up with an upper bound UB for use in RiskAlg. We show that, by an initial
sampling step, one can either detect that the problem is infeasible or come up with a suitable value for UB.
Second, due to the non-covering nature of the problem, one needs to delve deeper into the structure of the
dual LP for a scenario (after Lagrangifying (6)) to prove the closeness-in-subgradients property for the SAA
objective function constructed in SA-Alg and the true objective function.

Define $C_A = \sum_{j \in A} (\min_i c_{ij})$. This is the minimum possible assignment cost that one can incur in
scenario $A$. We may determine with high probability using $O(\ldots)$ samples if $\Pr_A[C_A > B] > \rho$ or
$\Pr_A[C_A > B] \le \rho(1 + \frac{5\kappa}{28})$. In the former case, we can conclude that the problem is infeasible. In the
latter case, we set $\tilde\rho = \rho(1 + \ldots)$, $\tilde\kappa$ such that $\tilde\rho(1 + \tilde\kappa) = \rho(1 + \kappa)$, and $\mathrm{UB} = 32(1 + \epsilon)\bigl(\sum_i f_i^{\mathrm{I}} + B\bigr)/(\ldots)$, and call
procedure RiskAlg with these values of $\tilde\rho$, $\tilde\kappa$ and UB (and the given $\epsilon$, $\gamma$). We prove in Claim 4.4 below that
with this upper bound, $p^{(k)}, p'^{(k)} \le \rho' = \tilde\rho(1 + 3\tilde\kappa/4)$; this is the only condition required for the search for
$\Delta$ in RiskAlg.
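
The initial sampling step can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: `sample_scenario` is a hypothetical black-box sampler returning a tuple of client indices, the two thresholds are passed in as `lo` and `hi`, the decision rule (compare the empirical frequency with their midpoint) is one simple choice, and the number of samples is left as a parameter.

```python
import random


def min_assignment_cost(scenario, c):
    """C_A: every client in the scenario is charged its nearest facility.

    c[i][j] is the cost of assigning client j to facility i.
    """
    return sum(min(row[j] for row in c) for j in scenario)


def estimate_exceed_probability(sample_scenario, c, B, num_samples, rng):
    """Empirical estimate of Pr_A[C_A > B] from black-box samples."""
    hits = sum(
        1
        for _ in range(num_samples)
        if min_assignment_cost(sample_scenario(rng), c) > B
    )
    return hits / num_samples


def feasibility_test(sample_scenario, c, B, lo, hi, num_samples, seed=0):
    """Report 'infeasible' when the estimate suggests Pr_A[C_A > B] > lo,
    and 'proceed' when it suggests Pr_A[C_A > B] <= hi (lo < hi).

    Assumption: the midpoint rule below stands in for the test in the text.
    """
    rng = random.Random(seed)
    estimate = estimate_exceed_probability(sample_scenario, c, B, num_samples, rng)
    return "infeasible" if estimate > (lo + hi) / 2 else "proceed"
```

The gap between `lo` and `hi` is what makes a sampling test possible at all: without it, no bounded number of samples could separate the two cases reliably.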

Task (b) boils down to showing that the objective function $\hat h(\Delta; \cdot)$ of the SAA-problem in SA-Alg and the
true problem $h(\Delta; \cdot)$ satisfy the conditions of Lemma 2.2. Due to the non-covering nature of the formulation,
we need to derive additional insights about optimal dual solutions to $g_A(\Delta; y)$ to prove this. Lemma 4.5
proves that this holds with high probability, with $K = \lambda\|f^{\mathrm{I}}\| + |\Delta|$, $\varrho = \epsilon$ and $\xi = \ldots$. So by Lemma 2.2,
the solution $\hat y = \arg\min_{y \in \mathcal{P}} \hat h(\Delta; y)$ returned by SA-Alg satisfies the requirements of Theorem 3.4. As in
the set cover problem, we may take $R = \sqrt{m}$, $V = \frac12$, which ensures that the sample size is polynomially
bounded. The proof of the conversion to a multiplicative guarantee is as in Theorem 3.3. ■

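Recall that $\Delta_k \ge \mathrm{UB}$ and $p^{(k)} = \sum_A p_A r_A$, where $(y, \{(x_A, y_A, u_A, v_A, r_A)\})$ is the solution returned
by SA-Alg for $\Delta_k$ of cost $h(\Delta_k; y) \le (1 + \varepsilon)\,\mathrm{OPT}(\Delta_k) + \eta\Delta_k + \zeta$, with $\varepsilon, \eta, \zeta$ set as in RiskAlg.

Claim 4.4 We have $p^{(k)} \le \rho'$ and $p'^{(k)} \le \rho'$, where $\rho' = \tilde\rho(1 + 3\tilde\kappa/4)$.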

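Proof : Let $F = \sum_i f_i^{\mathrm{I}}$ and $q = \Pr_A[C_A > B] \le \rho(1 + 5\kappa/28)$. Consider the solution $\hat y$ with $\hat y_i = 1$ for
all $i$. For any $\Delta \ge 0$, we have $\mathrm{OPT}(\Delta) \le h(\Delta; \hat y) \le F + \sum_A p_A C_A + q\Delta \le F + B + q\Delta$. Suppose
$p^{(k)} > \rho' - \beta\tilde\rho$. Then $\mathrm{cost}^{(k)} - \eta\Delta_k \ge \Delta_k \tilde\rho(1 + 9\tilde\kappa/16) \ge \Delta_k \rho(1 + 9\kappa/16)$, where the last inequality follows
since $\tilde\rho(1 + \tilde\kappa) = \rho(1 + \kappa)$ and $\tilde\rho \ge \rho$. Also $(1 + \varepsilon)\,\mathrm{OPT}(\Delta_k) + \zeta \le 2(1 + \varepsilon)(F + B) + (1 + \varepsilon) q \Delta_k \le
2(1 + \varepsilon)(F + B) + \Delta_k \rho(1 + 3\kappa/8)$ since $\varepsilon = \epsilon/6 < \tilde\kappa/6 < \kappa/6$. But then $\mathrm{cost}^{(k)} - \eta\Delta_k > (1 + \varepsilon)\,\mathrm{OPT}(\Delta_k) + \zeta$,
which gives a contradiction. So $p^{(k)} \le \rho' - \beta\tilde\rho$, which implies that $p'^{(k)} \le \rho'$. ■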

Lemma 4.5 With probability at least $1 - \delta$, $\hat h(\Delta; \cdot)$ and $h(\Delta; \cdot)$ satisfy the conditions of Lemma 2.2 with
$K = \lambda\|f^{\mathrm{I}}\| + |\Delta|$, $\varrho = \epsilon$ and $\xi = \ldots$.

Proof : Consider a point $y \in \mathcal{P}$. Consider an optimal dual solution to $g_A(\Delta; y)$, where $\alpha^*_{A,j}$, $\psi^*_{A,j}$, $\beta^*_{A,ij}$, $\Gamma^*_{A,ij}$, $\theta^*_A$
are the optimal values of the dual variables corresponding to (7)-(11) respectively. Note that $g_A(\Delta; y)$ equals
the objective value of this dual solution, which is given by

$$
\sum_{j \in A} (\alpha^*_{A,j} + \psi^*_{A,j}) - \sum_i y_i \Bigl( \sum_{j \in A} (\beta^*_{A,ij} + \Gamma^*_{A,ij}) \Bigr) - B \cdot \theta^*_A .
$$

We choose an optimal dual solution that minimizes $\sum_{i,j} \beta^*_{A,ij}$. As in Lemma 3.6, it is easy to show that
the vectors $d_y = (d_{y,i})$ and $\hat d_y = (\hat d_{y,i})$ given by $d_{y,i} = f_i^{\mathrm{I}} - \sum_A p_A \sum_{j \in A} (\beta^*_{A,ij} + \Gamma^*_{A,ij})$ and $\hat d_{y,i} =
f_i^{\mathrm{I}} - \sum_A \hat p_A \sum_{j \in A} (\beta^*_{A,ij} + \Gamma^*_{A,ij})$ are respectively subgradients of $h(\Delta; \cdot)$ and $\hat h(\Delta; \cdot)$ at $y$.
Now we claim that for every $i$,
$\sum_j \beta^*_{A,ij} \le \Delta$ and $\sum_j \Gamma^*_{A,ij} \le f_i^A$. Given this, $\|d_y\|, \|\hat d_y\| \le K$ where
$K = \lambda\|f^{\mathrm{I}}\| + |\Delta|$ for any $y \in \mathcal{P}$, so $K$ is an upper bound on the Lipschitz constant of $h(\Delta; \cdot)$ and $\hat h(\Delta; \cdot)$.

The second inequality is a constraint of the dual, corresponding to variable $v_{A,i}$. Suppose $\beta^*_{A,ij} > 0$
for some $j$. The dual enforces the constraint $\alpha^*_{A,j} + \psi^*_{A,j} \le c_{ij}(1 + \theta^*_A) + \beta^*_{A,ij} + \Gamma^*_{A,ij}$, corresponding
to variable $x_{A,ij}$. We claim that this must hold at equality. By complementary slackness, we have $x^*_{A,ij} =
y_i + y^*_{A,i}$, where $(x^*_A, y^*_A, u^*_A, v^*_A)$ is an optimal primal solution to $g_A(\Delta; y)$. So if $y_i > 0$ then $x^*_{A,ij} > 0$
and complementary slackness gives the desired equality. If $y_i = 0$ and the above inequality is strict, then
we may decrease $\beta^*_{A,ij}$ while maintaining dual feasibility and optimality, which gives a contradiction to the
choice of the dual solution. Thus, since the dual also imposes that $\psi^*_{A,j} \le c_{ij} + \Gamma^*_{A,ij}$ (corresponding to
$u_{A,ij}$), we have that $\beta^*_{A,ij} \le \alpha^*_{A,j}$, so $\sum_j \beta^*_{A,ij} \le \sum_j \alpha^*_{A,j} \le \Delta$ (the last inequality follows from the dual
constraint for $r_A$).

As in Lemma 3.7, if $d$ is a subgradient of $h(\Delta; \cdot)$ at $y$ and $\hat d$ is a vector such that $|\hat d_i - d_i| \le \omega f_i^{\mathrm{I}} + \ldots$,
then $\hat d$ is an $(\omega, \xi)$-subgradient of $h(\Delta; \cdot)$ at $y$.

Since $\mathrm{E}[\hat d_{y,i}] = d_{y,i}$ for every $y$ and $i$, plugging in the sample size $\mathcal{N}$ used in SA-Alg and using the
Chernoff bound (Lemma 2.3), we obtain that, with probability at least $1 - \delta$, $|\hat d_{y,i} - d_{y,i}| \le \ldots f_i^{\mathrm{I}} + \ldots$ for all $i$,
for every point $y$ in the extended net $G$ of $\mathcal{P}$. Thus, with probability at least $1 - \delta$, $\hat d_y$ is an $(\ldots, \ldots)$-subgradient
of $h(\Delta; \cdot)$ at $y$ for every $y \in G$. ■

We now discuss the rounding procedure. Analogous to part (i) of Theorem 3.2, it is not hard to see that if
$(y, \{(x_A, y_A, u_A, v_A, r_A)\})$ is a solution satisfying (7)-(12) of objective value $C$ with $P = \sum_A p_A r_A$, then
for any $\varepsilon > 0$, taking $\hat y = y(1 + \frac{1}{\varepsilon})$ gives $f^{\mathrm{I}} \cdot \hat y + \sum_A p_A \ell_A(\hat y) \le (1 + \frac{1}{\varepsilon}) C$ and
$\Pr_A[\ell_A(\hat y) > (1 + \frac{1}{\varepsilon}) B] \le (1 + \varepsilon) P$. So now one can use a local approximation algorithm for
2-stage stochastic facility location (SUFL) to round $\hat y$.
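
One way to see the probability bound in this scaling step is the following short derivation. It is a sketch in the notation of (7)-(12): the set $S$ is introduced here only for the argument, and the truncation of scaled values at 1 is ignored.

```latex
% Sketch: S is a helper set used only in this argument.
Let $S = \{A : r_A \ge \tfrac{1}{1+\varepsilon}\}$. By Markov's inequality,
\[
  \sum_{A \in S} p_A \;\le\; (1+\varepsilon) \sum_A p_A r_A \;=\; (1+\varepsilon) P .
\]
For $A \notin S$, constraint (7) gives, for every $j \in A$,
\[
  \sum_i x_{A,ij} \;\ge\; 1 - r_A \;>\; 1 - \tfrac{1}{1+\varepsilon}
    \;=\; \tfrac{\varepsilon}{1+\varepsilon},
  \qquad\text{so}\qquad
  \bigl(1+\tfrac{1}{\varepsilon}\bigr) \sum_i x_{A,ij} \;>\; 1 .
\]
Scaling $(x_A, y_A)$ by $1+\tfrac{1}{\varepsilon}$ keeps (9) valid with respect to
$\hat y = (1+\tfrac{1}{\varepsilon})\, y$ and, by (11), costs at most
$(1+\tfrac{1}{\varepsilon}) B$. Any assignment above 1 can be scaled down without
increasing the cost, so $\ell_A(\hat y) \le (1+\tfrac{1}{\varepsilon}) B$ for every
$A \notin S$, and therefore
\[
  \Pr_A\bigl[\ell_A(\hat y) > (1+\tfrac{1}{\varepsilon}) B\bigr]
    \;\le\; \sum_{A \in S} p_A \;\le\; (1+\varepsilon) P .
\]
```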

Shmoys and Swamy [38] show that any LP-based $c$-approximation algorithm for the deterministic facility
location problem (DUFL) that satisfies a certain "demand-obliviousness" property can be used to
obtain a $\min\{2c, c + 1.52\}$-approximation algorithm for SUFL, by using it in conjunction with the
1.52-approximation algorithm for DUFL in [26]. "Demand-obliviousness" means that the algorithm should round
a fractional solution without having any knowledge about the client-demands, and is imposed to handle the
fact that one does not have the second-stage solutions explicitly. There are some difficulties in applying this
to our problem. First, the resulting algorithm for SUFL need not be local. Second, and more significantly,
even if we do obtain a local approximation algorithm for SUFL by the conversion process in [38], the
resulting algorithm may be randomized if the $c$-approximation algorithm for DUFL is randomized. This is
indeed the case in [38]; they obtain a randomized local 3.378-approximation algorithm using the
demand-oblivious, randomized 1.858-approximation algorithm of Swamy [43]. (This was improved to a randomized
local 3.25-approximation algorithm by Srinivasan [42], again using the algorithm in [43].) Using such a
randomized local $c'$-approximation algorithm for SUFL would yield a random integer solution such that
there is at least a $1 - \rho(1 + \kappa + \varepsilon)$ probability mass in scenarios for which the expected cost incurred, where
the expectation is over the random choices of the algorithm, is at most $c'B(1 + \epsilon + \frac{1}{\varepsilon})$. But we would like to
make the stronger claim that, with high probability over the random choices of the algorithm, we return a
solution where the probability-mass of scenarios with cost at most $c'B(1 + \epsilon + \frac{1}{\varepsilon})$ is at least $1 - \rho(1 + \kappa + \varepsilon)$.

We can take care of both issues by imposing the following (sufficient) condition on the demand-oblivious
algorithm for DUFL that is used to obtain an approximation algorithm for SUFL (via the conversion process
in [38]): we require that with probability 1, the algorithm return an integer solution where each client's
assignment cost is within some factor of its cost in the fractional solution. One can use the randomized
approximation algorithm of Swamy [43] or the deterministic Shmoys-Tardos-Aardal (STA) algorithm [40],
both of which satisfy this condition. Given a fractional solution $(x, y)$ to DUFL with facility cost $F$, for a
parameter $\gamma \in (0, 1)$, the STA-algorithm returns an integer solution $(\hat x, \hat y)$ with facility cost at most $F/\gamma$,
where for every $j$, $\sum_i c_{ij} \hat x_{ij} \le \frac{3}{1-\gamma} \cdot \sum_i c_{ij} x_{ij}$ (so for any demands, the total assignment cost is at most
$\frac{3}{1-\gamma}$ times the fractional assignment cost). Taking $\gamma = \frac14$ and applying the rounding procedure of [38] yields
the following theorem.

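Theorem 4.6 For any $\epsilon, \kappa, \varepsilon > 0$, there is a $\bigl(5.52(1 + \epsilon + \frac{1}{\varepsilon}),\ 5.52(1 + \epsilon + \frac{1}{\varepsilon}),\ 1 + \kappa + \varepsilon\bigr)$-approximation
algorithm for risk-averse budgeted facility location.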
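
The constant 5.52 can be traced with a short calculation. The sketch below assumes the usual STA filtering bounds (facility cost blown up by $1/\gamma$, each client's assignment cost by $3/(1-\gamma)$) with $\gamma = 1/4$, and plugs the resulting factor $c$ into the $\min\{2c, c + 1.52\}$ conversion of [38].

```latex
\[
  c \;=\; \max\Bigl\{ \tfrac{1}{\gamma},\ \tfrac{3}{1-\gamma} \Bigr\}\Big|_{\gamma = 1/4}
    \;=\; \max\{4,\ 4\} \;=\; 4,
  \qquad
  \min\{2c,\ c + 1.52\} \;=\; \min\{8,\ 5.52\} \;=\; 5.52 .
\]
```

The choice $\gamma = 1/4$ balances the two blow-ups, which is why both the cost and the budget factor in the theorem carry the same constant.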

Remark 4.7 The local approximation algorithm for SUFL developed by [33] is unsuitable for our purposes, 
since this algorithm needs to know explicitly the second-stage fractional solution for each scenario, which 
is an exponential amount of information. 

Budget constraints on individual components of the second-stage cost. Our techniques can be used to
devise approximation algorithms for various fairly general risk-averse versions of facility location. Since the
second-stage cost consists of two distinct components, the facility-opening cost and the client-assignment
cost, one can consider risk-averse budgeted versions of the problem where we impose a joint probabilistic
budget constraint on the total second-stage cost, and each component of the second-stage cost. That is,
consider (RAFL-P) with the following additional constraints for each scenario $A$: $\sum_i f_i^A y_{A,i} \le B_F$ and
$\sum_{j,i} c_{ij} x_{A,ij} \le B_C$. Here $B_F$ and $B_C$ are respectively budgets on the per-scenario facility-opening and
client-assignment costs. To put it in words, (RAFL-P) augmented with the above constraints imposes the
following joint probabilistic budget constraint:

$\Pr_A[$ total cost of scenario $A > B$ OR facility-cost of scenario $A > B_F$
OR assignment-cost of scenario $A > B_C$ $] \le \rho$.

Note that by setting the appropriate budget to $\infty$ we can model the absence of a particular budget constraint.
One can model various interesting situations by setting $B, B_F, B_C$ suitably. For example, suppose we set
$B_F = 0$ and $B = \infty$ (or equivalently $B = B_C$). Then we seek a minimum-cost solution where we want
to choose facilities to open in stage I such that with probability at least $1 - \rho$, we can assign the clients in a
scenario $A$ to (a subset of) the stage I facilities while incurring assignment cost at most $B_C$. One can also
consider risk-averse robust versions of the problem where we seek to minimize the first-stage cost plus the
$(1 - \rho)$-quantile of a certain component of the second-stage cost (i.e., the second-stage facility-opening, or
assignment, or total cost). Employing the usual "guessing" trick, this gives rise to a budget problem where
we have a budget constraint for a single component of the second-stage cost (that is, two of $B$, $B_F$ and
$B_C$ are set to $\infty$). As before, the guarantees obtained for the budget problem (see below) translate to this
risk-averse robust problem.

Our techniques can be used to solve this more general LP. Specifically, Theorem 4.3 continues to hold.
But here we face the complication that even if we have a first-stage solution $x$ to the fractional risk-averse
problem for which we know that there exist second-stage feasible solutions that yield a solution of total
expected cost $C$, it is not clear how to compute such feasible second-stage solutions. However, notice that
RiskAlg not only returns a first-stage solution (with the above existence property) but also shows how to
compute a suitable second-stage solution in each scenario $A$, which thus allows us to specify completely
a near-optimal solution to the LP-relaxation (where the RHS of (6) is $\rho(1 + \kappa)$). Whereas earlier we used
these solutions only in the analysis, now they are part of the algorithm. In the rounding procedure, the first
step, where we convert the solution to the LP-relaxation to a fractional solution to the risk-averse problem, is
unchanged. But we of course now need a stronger notion of "locality" from our approximation algorithm for
SUFL. We need an algorithm that approximately preserves (with probability 1) both the facility-opening and
client-assignment components of the second-stage cost of each scenario. (Clearly, if the budget constraint is
imposed on only one of the components then we only need the cost-preservation of that component.) Many
LP-rounding algorithms for SUFL (such as the ones in [38, 42]) do in fact come with this stronger local
guarantee. Thus, one can use these to obtain an approximation algorithm for the above risk-averse problem
with multiple budget constraints.

Finally, we obtain the same approximation guarantees with non-uniform scenario budgets $\{(B^A, B_F^A, B_C^A)\}$.
The only small detail here is that in order to obtain the upper bound UB for use in RiskAlg, we now
determine if $\Pr_A[C_A > \min\{B^A, B_C^A\}]$ is greater than $\rho$ or at most $\rho(1 + \ldots)$. In the former case, we
conclude infeasibility, and in the latter, we set $\tilde\rho = \rho(1 + \ldots)$, $\tilde\kappa$ such that $\tilde\rho(1 + \tilde\kappa) = \rho(1 + \kappa)$,
and $\mathrm{UB} = 32(1 + \epsilon)\bigl(\ldots \max_i c_{ij}\bigr)/(\ldots)$, and run RiskAlg with these values. (Note that we may assume that
$B^A \le \sum_j \max_i c_{ij}$ for all $A$.)



5 Sampling lower bounds 

We now prove various lower bounds on the sample size required to obtain a bounded approximation
guarantee for the risk-averse budgeted problem in the black-box model. We show that the dependence of the
sample size on $\frac{1}{\kappa}$ for an additive violation of $\kappa$ in the probability threshold is unavoidable in the black-box
model even for the fractional risk-averse problem and even if we allow a bounded violation of the budget.

The crux of our lower bounds is the following observation. Consider the following problem. We are
given as input a threshold $\varepsilon \in (0, \frac14)$ and a biased coin with probability $q$ of landing heads, where the coin is
given as a black-box; that is, we do not know $q$ but may toss the coin as many times as necessary to "learn"
$q$. The goal is to determine if $q \le \varepsilon$ or $q > 2\varepsilon$; if $q \in (\varepsilon, 2\varepsilon]$ then the algorithm may answer anything. We
prove that for any $\delta < \frac12$, any algorithm that ensures error probability at most $\delta$ on every input must need at
least $\mathcal{N}(\delta; \varepsilon) = \ln(\frac{1}{\delta} - 1)/4\varepsilon$ coin tosses for each threshold $\varepsilon$.

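Lemma 5.1 Let $\delta < \frac12$ and $\mathcal{A}_{N(\delta;\varepsilon)}$ be an algorithm that has failure probability at most $\delta$ and uses at most
$N(\delta; \varepsilon)$ coin tosses for threshold $\varepsilon$. Then, $N(\delta; \varepsilon) \ge \mathcal{N}(\delta; \varepsilon)$ for every $\varepsilon \in (0, \frac14)$.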

Proof : Suppose $N(\delta; \varepsilon) < \mathcal{N}(\delta; \varepsilon)$ for some $\varepsilon \in (0, \frac14)$. Let $X$ be a random variable that denotes the
number of times the coin lands heads. If $X = 0$ then the algorithm must say "$q \le \varepsilon$" with probability at
least $1 - \delta$, otherwise the algorithm errs with probability more than $\delta$ on $q = 0$. But then for some $q_0 < \frac12$
slightly greater than $2\varepsilon$, we have $\Pr[X = 0] > (1 - 2\varepsilon)^{\mathcal{N}(\delta;\varepsilon)} \ge \frac{\delta}{1-\delta}$. So $\mathcal{A}$ will say "$q \le \varepsilon$" (and hence,
err) for $q = q_0$, with probability more than $\delta$. ■
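
The counting in this proof can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and the bias $q_0$ is replaced by its limiting value $2\varepsilon$. It evaluates the bound $\ln(\frac{1}{\delta} - 1)/4\varepsilon$ and the probability that a $\delta$-reliable algorithm with $n$ tosses sees no heads and therefore answers "$q \le \varepsilon$" wrongly.

```python
import math


def toss_lower_bound(delta, eps):
    """The quantity ln(1/delta - 1) / (4 * eps) from the lemma."""
    if not (0 < delta < 0.5 and 0 < eps < 0.25):
        raise ValueError("need 0 < delta < 1/2 and 0 < eps < 1/4")
    return math.log(1 / delta - 1) / (4 * eps)


def prob_no_heads(q, n):
    """Probability that n independent tosses of a q-biased coin show no heads."""
    return (1 - q) ** n


def forced_error_probability(delta, eps, n):
    """Lower bound on the error at bias (just above) 2*eps.

    With no heads observed the algorithm must answer 'q <= eps' with
    probability at least 1 - delta, or it would fail on q = 0.
    """
    return (1 - delta) * prob_no_heads(2 * eps, n)
```

For example, with `delta = 0.1` and `eps = 0.01` the bound lies between 54 and 55 tosses; with 54 tosses the forced error probability stays well above `delta`, whereas with 200 tosses it drops below it, consistent with the lemma.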

As a corollary we obtain that for any $\delta < \frac12$, it is impossible to determine if $q = 0$ or if $q > 0$ with error
probability at most $\delta$ using a bounded number of samples.

Now consider risk-averse budgeted set cover. We say that a solution is an $(\epsilon, \gamma)$-optimal solution if its
cost is at most $(1 + \epsilon)\,\mathrm{OPT} + \gamma$. Suppose there is an algorithm $\mathcal{A}$ for risk-averse budgeted set cover that on
any input (with a black-box distribution) draws a bounded number of samples and returns an $(\epsilon, \gamma)$-optimal
solution with probability at least $1 - \delta$, $\delta < \frac12$, where the probability-threshold is violated by at most $\kappa$.
Consider the following risk-averse budgeted set-cover instance. There are three elements $e_1, e_2, e_3$ and three
sets $S_i = \{e_i\}$, $i = 1, 2, 3$. The budget is $B > 6\gamma$ and the probability threshold is $\rho < \frac{1}{8(1+\epsilon)}$. The costs
are $w^{\mathrm{I}}_{S_i} = B$ for all $i$, and $w^A_{S_1} = 0$, $w^A_{S_2} = w^A_{S_3} = 2B/3$ for every scenario $A$. Let $\kappa < \frac14$. There are 3
scenarios: $A_0 = \emptyset$, $A_1 = \{e_1, e_2, e_3\}$, $A_2 = \{e_2, e_3\}$ with $p_{A_1} = \rho - \kappa$, $p_{A_0} = 1 - p_{A_1} - p_{A_2}$. Observe that
if $p_{A_2} \le \kappa$, then $\mathrm{OPT} \le \rho \cdot 4B/3$, and every $(\epsilon, \gamma)$-optimal solution must have $x_{S_1} + x_{S_2} + x_{S_3} < \frac13$. But
if $p_{A_2} > 2\kappa$ (which is possible since $\rho < 1$) then any solution where the probability of exceeding the budget
is at most $\rho + \kappa$ must have $x_{S_2} + x_{S_3} \ge \frac12$, otherwise the cost in both scenarios $A_1$ and $A_2$ will exceed
$B$. Thus, algorithm $\mathcal{A}$ can be used to determine if $p_{A_2} \le \kappa$ or $p_{A_2} > 2\kappa$. This is true even if we allow the
budget to be violated by a factor $c < \ldots$, since we must still have $x_{S_2} + x_{S_3} > \ldots$ if $p_{A_2} > 2\kappa$; choosing
$B \gg 1$, $\ldots \ll 1$, we can allow an arbitrarily large budget-violation. So since $\mathcal{A}$ has failure probability at
most $\delta$, by Lemma 5.1, it must draw $\Omega(\frac{1}{\kappa})$ samples.

Taking $\kappa = 0$ shows that obtaining guarantees without violating the probability threshold is impossible
with a bounded sample size, whereas taking an additive violation of $\kappa\rho$ shows that a multiplicative $(1 + \kappa)$-factor
violation of the probability threshold requires $\Omega(\frac{1}{\kappa\rho})$ samples. Moreover, taking $\rho = 0$ shows that one cannot hope to
achieve any approximation guarantees in the (standard) budget model with black-box distributions.

Theorem 5.2 For any $\epsilon, \gamma > 0$, $\delta < \frac12$, every algorithm for risk-averse budgeted set cover that returns an
$(\epsilon, \gamma)$-optimal solution with failure probability at most $\delta$ using a bounded number of samples

• must violate the probability threshold on some input;

• requires $\Omega(\frac{1}{\kappa})$ samples if the probability-threshold is violated by at most an additive $\kappa$;

• requires $\Omega(\frac{1}{\kappa\rho})$ samples if the probability-threshold is violated by at most a multiplicative $(1 + \kappa)$-factor.

The proof of impossibility of approximation in the standard robust model with a bounded sample size
is even simpler. Consider the following set cover instance. We have a single element $e$ that gets "activated"
with some probability $p$; the cost of the set $S = \{e\}$ is 1 in stage I and some large number $M$ in stage II.
If $p = 0$ then $\mathrm{OPT} = 0$, otherwise $\mathrm{OPT} = 1$. Thus, it is easy to see that an algorithm returning an
$(\epsilon, \gamma)$-optimal solution can be used to distinguish between these two cases (it should set $x_S \le \gamma$ in the former
case, and $x_S$ sufficiently large in the latter).
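
The single-element instance can be evaluated directly. The sketch below is illustrative and makes its own modelling assumptions: the robust objective is taken to be the first-stage cost plus the worst second-stage cost over scenarios of positive probability, the fractional decision `x` buys the set to extent `x` in stage I, and the optimum is found by a simple grid search. The function names are hypothetical.

```python
def robust_cost(x, p, M):
    """First-stage cost x plus the worst-case second-stage cost.

    The element is activated with probability p; if p == 0 the only
    scenario in the support is the empty one, which costs nothing.
    """
    worst_case = (1 - x) * M if p > 0 else 0.0
    return x + worst_case


def robust_opt(p, M, grid=1000):
    """Minimum robust cost over x in {0, 1/grid, ..., 1}."""
    return min(robust_cost(k / grid, p, M) for k in range(grid + 1))
```

The optimum jumps from 0 to 1 as soon as `p` becomes positive, however small, which is exactly the discontinuity that no bounded number of samples can detect.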

References 

[1] C. Acerbi and D. Tasche. On the coherence of expected shortfall. Journal of Banking and Finance,
26:1487-1503, 2002.

[2] E. M. L. Beale. On minimizing a convex function subject to linear inequalities. Journal of the Royal 
Statistical Society, Series B, 17:173-184; discussion 194-203, 1955. 

[3] D. Bertsimas and M. Sim. The price of robustness. INFORMS Journal on Operations Research,
52:35-38, 2004.

[4] J. R. Birge and F. V. Louveaux. Introduction to Stochastic Programming. Springer-Verlag, NY, 1997.

[5] J. Borwein and A. S. Lewis. Convex Analysis and Nonlinear Optimization. Springer-Verlag, NY, 2000.

[6] G. Calafiore and M. Campi. The scenario approach to robust control design. IEEE Transactions on 
Automatic Control, 51(5):742-753, 2006. 

[7] M. Charikar, C. Chekuri, and M. Pal. Sampling bounds for stochastic optimization. Proceedings, 9th 
RANDOM, pages 257-269, 2005. 

[8] A. Charnes and W. Cooper. Uncertain convex programs: randomized solutions and confidence levels.
Management Science, 6:73-79, 1959.

[9] V. Chvatal. A greedy heuristic for the set-covering problem. Mathematics of Operations Research, 
4:233-235, 1979. 






[10] G. B. Dantzig. Linear programming under uncertainty. Management Science, 1:197-206, 1955. 



[11] K. Dhamdhere, V. Goyal, R. Ravi, and M. Singh. How to pay, come what may: approximation algo- 
rithms for demand-robust covering problems. Proceedings, 46th Annual IEEE Symposium on Founda- 
tions of Computer Science, pages 367-378, 2005. 

[12] S. Dye, L. Stougie, and A. Tomasgard. The stochastic single resource service-provision problem. 
Naval Research Logistics, 50(8):869-887, 2003. Also appeared as "The stochastic single node ser- 
vice provision problem", COSOR-Memorandum 99-13, Dept. of Mathematics and Computer Science, 
Eindhoven Technical University, Eindhoven, 1999. 

[13] E. Erdogan and G. Iyengar. On two-stage convex chance constrained problems. Math. Methods of
Operations Research, 65(1):115-140, 2007.

[14] U. Feige, K. Jain, M. Mahdian, and V. Mirrokni. Robust combinatorial optimization with exponential
scenarios. Proceedings, 13th IPCO, pages 439-453, 2007.

[15] A. Goel and P. Indyk. Stochastic load balancing. Proceedings, 40th Annual IEEE Symposium on
Foundations of Computer Science, pages 579-586, 1999.

[16] D. Golovin, V. Goyal, and R. Ravi. Pay today for a rainy day: improved approximation algorithms for 
demand-robust min-cut and shortest path problems. Proceedings, 23rd STACS, pages 206-217, 2006. 

[17] A. Gupta, M. Pal, R. Ravi, and A. Sinha. Boosted sampling: approximation algorithms for stochastic 
optimization. Proceedings, 36th Annual ACM Symposium on Theory of Computing, pages 417-426, 
2004. 

[18] A. Gupta, M. Pal, R. Ravi, and A. Sinha. What about Wednesday? Approximation algorithms for 
multistage stochastic optimization. Proceedings, 8th APPROX, pages 86-98, 2005. 

[19] A. Gupta, R. Ravi, and A. Sinha. An edge in time saves nine: LP rounding approximation algo- 
rithms for stochastic network design. In Proceedings, 45th Annual IEEE Symposium on Foundations 
of Computer Science, pages 218-227, 2004. 

[20] A. Hayrapetyan, C. Swamy, and E. Tardos. Network design for information networks. Proceedings, 
16th SODA, pages 933-942, 2005. 

[21] W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American 
Statistical Association, 58:13-30, 1963. 

[22] N. Immorlica, D. Karger, M. Minkoff, and V. Mirrokni. On the costs and benefits of procrastination: 
approximation algorithms for stochastic combinatorial optimization problems. Proceedings, 15th An- 
nual ACM-SIAM Symposium on Discrete Algorithms, pages 684-693, 2004. 

[23] P. Jorion. Value at Risk: A New Benchmark for Measuring Derivatives Risk. Irwin Professional 
Publishers, New York, 1996. 

[24] J. Kleinberg, Y. Rabani, and E. Tardos. Allocating bandwidth for bursty connections. SIAM Journal 
on Computing, 30(1):191-217, 2000. 

[25] A. J. Kleywegt, A. Shapiro, and T. Homem-De-Mello. The sample average approximation method for 
stochastic discrete optimization. SIAM Journal of Optimization, 12:479-502, 2001. 






[26] M. Mahdian, Y. Ye, and J. Zhang. Approximation algorithms for metric facility location problems.
SIAM Journal on Computing, 36:411-432, 2006.

[27] H. M. Markowitz. Portfolio selection. Journal of Finance, 7:77-91, 1952. 

[28] A. Nemirovski and A. Shapiro. Scenario approximations of chance constraints. In G. Calafiore and
F. Dabbene, editors, Probabilistic and Randomized Methods for Design under Uncertainty,
Springer-Verlag, 2005.

[29] A. Prekopa. Contributions to the theory of stochastic programming. Mathematical Programming, 
4:202-221, 1973. 

[30] A. Prekopa. Stochastic Programming. Kluwer Academic Publishers, Dordrecht, 1995. 

[31] A. Prekopa. Probabilistic programming. In A. Ruszczynski and A. Shapiro, editors, Stochastic Pro- 
gramming, volume 10 of Handbooks in Operations Research and Mgmt. Sc., North-Holland, Amster- 
dam, 2003. 

[32] M. Pritsker. Evaluating value at risk methodologies. Journal of Financial Services Research, 
12(2/3):201-242, 1997. 

[33] R. Ravi and A. Sinha. Hedging uncertainty: approximation algorithms for stochastic optimization 
problems. Mathematical Programming, Series A, 108:97-114, 2006. 

[34] R. Rockafellar and S. Uryasev. Conditional value-at-risk for general loss distributions. Journal of 
Banking and Finance, 26:1443-1471, 2002. 

[35] A. Ruszczynski and A. Shapiro. Editors, Stochastic Programming, volume 10 of Handbooks in Oper- 
ations Research and Mgmt. Sc., North-Holland, Amsterdam, 2003. 

[36] A. Ruszczynski and A. Shapiro. Optimization of risk measures. In G. Calafiore and F. Dabbene,
editors, Probabilistic and Randomized Methods for Design under Uncertainty, Springer-Verlag, 2005.

[37] A. Shapiro. Monte Carlo sampling methods. In A. Ruszczynski and A. Shapiro, editors, Stochas- 
tic Programming, volume 10 of Handbooks in Operations Research and Mgmt. Sc., North-Holland, 
Amsterdam, 2003. 

[38] D. B. Shmoys and C. Swamy. An approximation scheme for stochastic linear programming and its 
application to stochastic integer programs. Journal of the ACM, 53(6):978-1012, 2006. 

[39] D. B. Shmoys and C. Swamy. Stochastic optimization is (almost) as easy as deterministic optimization. 
Proceedings, 45th Annual IEEE FOCS, pages 228-237, 2004. 

[40] D. B. Shmoys, E. Tardos, and K. I. Aardal. Approximation algorithms for facility location problems. 
Proceedings, 29th Annual ACM Symposium on Theory of Computing, pages 265-274, 1997. 

[41] A. M-C. So, J. Zhang, and Y. Ye. Stochastic combinatorial optimization with controllable risk aversion 
level. Proceedings, 9th APPROX, pages 224-235, 2006. 

[42] A. Srinivasan. Approximation algorithms for stochastic and risk-averse optimization. Proceedings, 
18th SODA, pages 1305-1313, 2007. 

[43] C. Swamy. Approximation Algorithms for Clustering Problems. Ph.D. thesis, Cornell University,
Ithaca, NY, 2004. http://www.math.uwaterloo.ca/~cswamy/theses/master.pdf.






[44] C. Swamy and D. B. Shmoys. Approximation algorithms for 2-stage stochastic optimization problems.
ACM SIGACT News, 37(1):33-46, March 2006. Also appeared in Proceedings, 26th FSTTCS, pages
5-19, 2006.

[45] C. Swamy and D. B. Shmoys. Sampling-based approximation algorithms for multi-stage stochastic
optimization. http://www.math.uwaterloo.ca/~cswamy/papers/multistage-journ.pdf. Preliminary version
in Proceedings, 46th Annual IEEE Symposium on Foundations of Computer Science, pages 357-366,
2005.

A A bicriteria approximation for the Shmoys-Swamy class of 2-stage stochastic
LPs in the standard budget model ($\rho = 0$)

Here we sketch how one can obtain a bicriteria approximation algorithm for the class of 2-stage LPs
introduced in [38] in the standard budget model (that is, where we have a deterministic budget constraint). We
show that for any $\rho > 0$, in time inversely proportional to $\rho$, one can obtain a near-optimal solution where
the total probability-mass of scenarios where the budget is violated is at most $\rho$. We consider the following
class of 2-stage stochastic LPs [38] (see footnote 2) in the standard budget model.

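$$
\min\ h(x) = w^{\mathrm{I}} \cdot x + \sum_A p_A f_A(x) \quad \text{subject to} \quad x \in \mathcal{P} \subseteq \mathbb{R}^m_{\ge 0},\ \ f_A(x) \le B \ \text{ for all } A \qquad \text{(Stoc-P)}
$$

where

$$
\begin{aligned}
f_A(x) = \min\ & w^A \cdot r_A + q^A \cdot s_A \\
\text{s.t.}\ & D^A s_A + T^A r_A \ge j^A - T^A x \\
& r_A, s_A \ge 0,\ r_A \in \mathbb{R}^m,\ s_A \in \mathbb{R}^n .
\end{aligned}
$$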

Here (a) $T^A \ge 0$ for every scenario $A$, and (b) for every $x \in \mathcal{P}$, $\sum_A p_A f_A(x) \ge 0$ and the primal and
dual problems corresponding to $f_A(x)$ are feasible for every scenario $A$. It is assumed that $\mathcal{P} \subseteq B(0, R)$,
and that $\mathcal{P}$ contains a ball of radius $V$ ($V \le 1$) where $\ln(\frac{R}{V})$ is polynomially bounded. Define $\lambda =
\max\bigl(1, \max_{A, S} \frac{w^A_S}{w^{\mathrm{I}}_S}\bigr)$; we assume that $\lambda$ is known. Let OPT be the optimum value and $\mathcal{I}$ denote the
input size.

It is possible to adapt the proofs in [38, 7, 44] to obtain the bicriteria guarantee, and one can also prove
an SAA theorem in the style of [45, 7]. But perhaps the simplest proof, which we now describe, is obtained
using the ellipsoid-based algorithm in [38]. Let $\mathcal{P}' = \{x \in \mathcal{P} : f_A(x) \le B \text{ for all } A\}$. Note that unlike in
the case where we have a probabilistic budget constraint, $\mathcal{P}'$ is a convex set.

Consider running the ellipsoid-based algorithm in [38] with the following modification. Suppose we
wish to return a solution of value at most $(1 + \epsilon)\,\mathrm{OPT} + \gamma$. Let $N = \mathrm{poly}(m, \ln(\ldots))$ be a suitably
large value that is equal to the number of iterations of the ellipsoid method. Let $\rho' = \rho/N$. Suppose the
center of the current ellipsoid is $x \in \mathcal{P}$. Using $O(\ldots)$ samples one can determine with high probability
if $\Pr_A[f_A(x) > B] > \rho'/2$ or if $\Pr_A[f_A(x) > B] \le \rho'$. In the former case, by sampling again $O(\ldots)$
times, with very high probability, we can obtain a scenario $A$ such that $f_A(x) > B$. Now we compute
a subgradient $d_{A,x}$ of $f_A(\cdot)$ (which is obtained from an optimal dual solution to $f_A(x)$) at $x$, and use the
inequality $d_{A,x} \cdot (y - x) \le 0$ to cut the current ellipsoid. Notice that this is a valid inequality since for any
$y \in \mathcal{P}'$, by the definition of a subgradient, we have $0 > f_A(y) - f_A(x) \ge d_{A,x} \cdot (y - x)$. In the latter
case, where we detect that $\Pr_A[f_A(x) > B] \le \rho'$, we continue as in the algorithm in [38]: we mark the
current point $x$ and use an approximate subgradient of $h(\cdot)$ at $x$ to cut the current ellipsoid. Proceeding this
way we obtain a collection of marked points $x_1, \ldots, x_k$, where $k \le N$, such that with high probability,
$\Pr_A[f_A(x_i) > B] \le \rho'$ for each $x_i$, and by the analysis in [38] we have that $\min_i h(x_i)$ is "close" to OPT.

2 This was stated in [39] with extra constraints $B^A s_A \ge h^A$, but this is equivalent to $\binom{B^A}{D^A} s_A + \binom{0}{T^A} r_A \ge \binom{h^A}{j^A} - \binom{0}{T^A} x$.

The next step in the algorithm in [38] is to find a point in the convex hull of $x_1, \ldots, x_k$ whose value
is close to $\min_i h(x_i)$ (procedure FindMin). Notice that for any point $y$ in the convex hull of $x_1, \ldots, x_k$,
we have $\Pr_A[f_A(y) > B] \le k\rho' \le \rho$: for any scenario $A$ with $f_A(x_i) \le B$ for all $i$, the convexity of
$f_A(\cdot)$ implies that $f_A(y) \le B$. Thus, although the set $\{x \in \mathcal{P} : \Pr_A[f_A(x) > B] \le \rho\}$ is not convex, this
does not present a problem for us. So one can use procedure FindMin in [38] to return a point $y$ such that
$h(y) \le (1 + \epsilon)\,\mathrm{OPT} + \gamma$ where $\Pr_A[f_A(y) > B] \le \rho$.
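
The probability bound for points in the convex hull is a union bound combined with convexity; written out as a sketch (with $k \le N$ marked points and $\rho' = \rho/N$ as above):

```latex
\[
  \Pr_A\bigl[f_A(y) > B\bigr]
  \;\le\; \Pr_A\bigl[\exists\, i : f_A(x_i) > B\bigr]
  \;\le\; \sum_{i=1}^{k} \Pr_A\bigl[f_A(x_i) > B\bigr]
  \;\le\; k\rho' \;\le\; N\rho' \;=\; \rho .
\]
```

The first inequality is where convexity enters: if $f_A(x_i) \le B$ for every $i$, then any convex combination $y$ of the $x_i$ also satisfies $f_A(y) \le B$.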